Preface



The AAECC symposium was started in June 1983 by Alain Poli (Toulouse),
who, together with R. Desq, D. Lazard, and P. Camion, organized the first
conference. At the beginning, the acronym AAECC meant “Applied Algebra and
Error Correcting Codes”. Over the years this meaning has shifted to “Applied
Algebra, Algebraic Algorithms, and Error Correcting Codes”. One reason was
the increasing importance of complexity, particularly for decoding algorithms.
During the AAECC-12 symposium, after a long discussion and a vote, the
conference committee decided to hold the next symposium in Hawaii, with Shu Lin
as Chairman. This vote also led to the decision to reinforce the theory and practice
of the coding side as well as the cryptographic aspects. Algebra is retained as
in the past, but slightly more oriented to finite fields, complexity, polynomials,
and graphs. The conference committee was enlarged from 15 to 20 members.
For AAECC-13 the main subjects covered are:

— Modulation and codes : communication systems. 

— Combinatorics : graphs and matrices, designs, arithmetic. 

— Cryptography. 

— Codes : iterative decoding, decoding methods, turbo decoding, block codes, 
convolutional codes, codes construction. 

— Codes and algebra : algebraic curves, Groebner bases and AG codes. 

— Algebra : rings and fields, Galois group, differential algebra, polynomials. 

Six invited speakers characterize the outlines of AAECC-13 : 

— Hui Jin and Robert McEliece (“RA Codes Achieve AWGN Channel Capac- 
ity”). 

— Bernd Sturmfels (“Monomial Ideals”). 

— Michael Clausen (“A Near-Optimal Program Generator for Almost Optimal
FFTs”).

— Tadao Kasami (“On Integer Programming Problems Related to Iterative 
Search Type Decoding Algorithms”). 

— G. David Forney, Jr. (“Codes on Graphs: A Survey for Algebraists”). 

— Ian Blake (“Applications of Curves with Many Points”). 

Except for AAECC-1 (Discrete Mathematics, 56, 1985) and AAECC-7 (Discrete
Mathematics, 33, 1991), the proceedings of all the symposia have been published
in Springer-Verlag’s Lecture Notes in Computer Science series (vol. 228,
229, 307, 356, 357, 508, 673, 948, 1255). It is a policy of AAECC to maintain a
high scientific standard. This has been made possible thanks to the many referees
involved. Each submitted paper was evaluated by at least two international
researchers.



AAECC-13 received 86 submissions; 42 were selected for publication in these 
proceedings while 33 additional works will contribute to the symposium as oral 
presentations. 

The symposium was organized by Marc Fossorier, Hideki Imai, Shu Lin, and 
Alain Poli, with the help of Stephen Cohen and Marie-Claude Gennero. 

We express our thanks to the Springer-Verlag staff, especially to Alfred Hofmann
and Ruth Abraham, as well as to Jinghu Chen, Ivana Djurdjevic, Yu Kou,
Qi Pan, Tsukasa Sugita, and Heng Tang, for their help in the preparation of
these proceedings.



September 1999 M. Fossorier, H. Imai, S. Lin, A. Poli 
Codes on Graphs: A Survey for Algebraists



G. David Forney, Jr. 

Massachusetts Institute of Technology, Cambridge MA 02139 USA 
forney@lids.mit.edu



Abstract. There is intense current interest in the subject of “codes 
on graphs.” Graphical models for codes arise from state realizations. 
Normal realizations may be assumed without loss of generality. A nor- 
mal realization has a dual which necessarily generates the dual code. 
Conventional trellises and tail-biting realizations are inherently normal. 
Efficient realizations of Reed-Muller codes are presented. Low-density 
parity-check codes use generic parity-check realizations. Algebraic tools 
may improve such graph-based codes. 



1 Introduction 

The subject of “codes on graphs” was founded by Tanner [Tan81], who was
inspired by Gallager’s low-density parity-check (LDPC) codes [Gal62].

In recent years graphical models and their associated decoding algorithms
have come into prominence as a common intellectual foundation for the study of
capacity-approaching codes such as turbo codes and LDPC codes [AM98,KFL98].

This paper is a survey of “codes on graphs” for readers with a background in 
algebraic coding theory. There are many interesting problems in this emerging 
field for which algebraic tools may be helpful, both in constructing efficient 
representations of known codes and in finding new families of graph-based codes. 

1.1 Codes, Realizations, and Graphs 

A block code C is any subset of $\mathcal{A} = \prod_{k \in I_A} A_k$. The symbol index set $I_A$
is any finite discrete set, not necessarily ordered, and the symbol configuration
space $\mathcal{A}$ is the Cartesian product of any collection of symbol variables $A_k, k \in I_A$.
The code thus consists of a subset of configurations $a = \{a_k, k \in I_A\} \in \mathcal{A}$, called
valid configurations, or codewords.

A group code is a code C with a componentwise group property; i.e., the
symbol configuration space $\mathcal{A} = \prod_{k \in I_A} A_k$ is the direct product of a collection
of symbol groups $A_k, k \in I_A$, and the code is a subgroup $C \subseteq \mathcal{A}$. Any linear code
is a group code.

In coding for memoryless channels, the ordering of the symbol index set $I_A$
is immaterial; two codes that differ only by a permutation of $I_A$ have the same
distance profile and the same performance on a memoryless channel, and are
usually taken to be equivalent. We may use this freedom to order $I_A$ however we
like, or even to let $I_A$ be unordered. In the field of “codes on graphs,” the “time
axis” $I_A$ is represented by a graph rather than by a subinterval of the integers.
A state realization of a code C involves two further types of elements:

1. A collection of state variables (“state spaces”) $S_j, j \in I_S$;

2. A collection of local codes or local constraints $C_i, i \in I_C$,

where $I_S$ and $I_C$ are finite discrete state and constraint index sets, respectively.
The state configuration space is $\mathcal{S} = \prod_{j \in I_S} S_j$. In general, the three index sets
$I_A$, $I_S$ and $I_C$ need not bear any relationship to one another.

Whereas symbol variables are specified a priori as part of the code descrip- 
tion, state variables are introduced by the designer into a realization to improve 
it in some sense. Symbol variables are visible, whereas state variables are hidden. 
In coding, symbol variables are transmitted over a channel and are observed in 
the presence of noise, whereas state variables are part of the internal represen- 
tation of the code and are unobserved. 

Each local code $C_i$ is a subset of a certain local Cartesian product set,

$$C_i \subseteq \prod_{k \in I_A(i)} A_k \times \prod_{j \in I_S(i)} S_j = \mathcal{A}(i) \times \mathcal{S}(i),$$

where $I_A(i) \subseteq I_A$ and $I_S(i) \subseteq I_S$ are the subsets of the corresponding index sets
that are involved in the local code $C_i$. The local code thus defines a set of valid
local codewords $(a_{|I_A(i)}, s_{|I_S(i)}) = \{\{a_k, k \in I_A(i)\}, \{s_j, j \in I_S(i)\}\} \in C_i$.

The degree of a local code Ci is defined as the number of symbol and state 
variables that are involved in Ci. Conversely, the degree of a symbol or state 
variable is defined as the number of local codes in which it is involved. 

In a group realization, each symbol variable $A_k$ and state variable $S_j$ is a
group, and each local code $C_i$ is a subgroup of the direct product $\mathcal{A}(i) \times \mathcal{S}(i)$.

The full behavior $B$ of a state realization is the subset of all configurations
$(a, s) \in \mathcal{A} \times \mathcal{S}$ such that all local constraints are satisfied; i.e., such that
$(a_{|I_A(i)}, s_{|I_S(i)}) \in C_i$ for all $i \in I_C$. In a group state realization, $B$ is a subgroup
of $\mathcal{A} \times \mathcal{S}$.

The code C that is realized by such a state realization is the projection
$C = B_{|\mathcal{A}}$ of the full behavior $B$ onto the symbol configuration space $\mathcal{A}$; i.e., the
set of all $a \in \mathcal{A}$ that occur as part of some valid configuration $(a, s) \in B$. In a
group state realization, C is a subgroup of $\mathcal{A}$; i.e., a group code.
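
The definitions of full behavior and projection can be made concrete by brute force on a toy example. The sketch below is illustrative only: the function name `realized_code` and the encoding of a local code as a triple (symbol indices, state indices, allowed local codewords) are assumptions made here, not notation from the paper. The example realizes the length-4 binary single-parity-check code with one hidden binary state variable shared by two (3, 2, 2) local codes.

```python
from itertools import product

def realized_code(symbol_alphabets, state_alphabets, local_codes):
    """Project the full behavior onto the symbol configuration space.

    local_codes is a list of (symbol_indices, state_indices, allowed), where
    allowed is a set of pairs (symbol_values, state_values) of tuples.
    (Hypothetical data layout, chosen for this illustration.)
    """
    code = set()
    for a in product(*symbol_alphabets):
        for s in product(*state_alphabets):
            # (a, s) is in the full behavior iff every local constraint holds.
            ok = all(
                (tuple(a[k] for k in sym_idx), tuple(s[j] for j in st_idx)) in allowed
                for sym_idx, st_idx, allowed in local_codes
            )
            if ok:
                code.add(a)  # a occurs in some valid configuration (a, s)
                break
    return code

bit = (0, 1)
# (3, 2, 2) single-parity-check local code on two symbols and one state.
spc = {((x, y), (z,)) for x in bit for y in bit for z in bit if (x ^ y ^ z) == 0}
# Symbols 0,1 and state 0 in the first local code; symbols 2,3 and state 0 in the second.
local_codes = [((0, 1), (0,), spc), ((2, 3), (0,), spc)]
code = realized_code([bit] * 4, [bit], local_codes)
print(sorted(code))
```

Every symbol appears in exactly one local code and the state appears in exactly two, so this toy realization has the degree pattern of a normal realization as defined next.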

A normal realization is a state realization in which all symbol variables 
have degree 1 and all state variables have degree 2. In other words, each symbol 
variable is involved in precisely one local code, and each state variable is involved 
in precisely two local codes. 

A normal graph is a graphical model of a normal realization with the 
following elements: 

1. Each symbol variable Ak is represented by a leaf edge of degree 1; 

2. Each state variable Sj is represented by an ordinary edge of degree 2; 

3. Each local code Ci is represented by a vertex; 

4. An edge is incident on a vertex if the corresponding state or symbol variable 
is involved in the corresponding local code. 
A normal graph is thus a graph with leaves; i.e., a simple hypergraph in which
edge degrees can be only 1 or 2.

A normal realization may be characterized by the properties of its associated 
normal graph; e.g., we may say that a realization is “connected” or “cycle-free.” 

It is shown in [For98] that any state realization can be “normalized” to 
become a normal realization; i.e., given any realization of a code C, one can find 
an equivalent normal realization of C. 

A group normal realization of a group code C may be “dualized” by replacing 
each state or symbol variable by its character group, replacing each local code 
by its dual (orthogonal) code (in the sense of Pontryagin duality), and inverting 
the sign of each state variable in one of the two local codes in which it is in- 
volved. In the case of linear codes over a field F, symbol and state variables are 
unchanged by this procedure, and local codes are replaced by their dual codes in 
the usual sense. Moreover, if the field has characteristic 2, then sign inversions 
are unnecessary. The dual realization theorem of [For98] shows that such a dual
realization necessarily realizes the dual code $C^\perp$.



1.2 Decoding Codes on Graphs 

From a practical viewpoint, the most important property of a graphical model is 
that it specifies a family of associated “fast” decoding algorithms, called generi- 
cally the sum-product algorithm. See [AM98,KFL98,For98] for expositions of this 
algorithm. The essential features of the sum-product algorithm are local compu- 
tations at each vertex using the sum-product update rule, and “message-passing” 
of results along edges. 

Two principal variants of the sum-product algorithm can perform APP decod- 
ing, which computes the a posteriori probability of every value of every symbol 
and state variable, or ML decoding, which computes the maximum-likelihood 
value and its likelihood for every symbol and state variable. 

On a cycle-free graph, the sum-product algorithm completes in a time of 
the order of the diameter of the graph, and is exact. On a graph with cycles, 
the sum-product update rule may still be used locally, but the global algorithm 
becomes iterative and approximate. Nonetheless, it often performs very well; 
e.g., it is the standard decoding algorithm for capacity-approaching codes such 
as turbo codes and low-density parity-check codes. 



2 Examples 

Examples will now be given of four types of normal realizations: conventional 
state realizations (trellis realizations), tail-biting realizations, realizations of Reed- 
Muller codes, and parity-check realizations such as are used for LDPC codes. 
2.1 Conventional State Realizations (Trellises)

A conventional state realization of a code C (i.e., a trellis realization) is
obtained by ordering the symbol index set $I_A$ and identifying it with a subinterval
of the integers; e.g., $I_A = [0, n)$. Notice that the ordering of $I_A$ is a design choice.

The state index set $I_S$ is taken to be essentially the same time axis, e.g.,
$I_S = [0, n]$. The state spaces $S_k, k \in I_S$ are generally chosen to be as small as
possible. For group codes, the State Space Theorem [FT93] defines an essentially
unique minimal state space $S_k$ for each time k, namely the quotient group

$$S_k = \frac{C}{C_{|k^-} \times C_{|k^+}},$$

where $C_{|k^-}$ and $C_{|k^+}$ are the subcodes of C that are supported entirely on the
“past” $k^- = [0, k)$ or on the “future” $k^+ = [k, n)$, respectively.

In particular, the initial and final state variables may always be taken to be
trivial unary variables: i.e., $|S_0| = |S_n| = 1$.
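
For a binary linear code, the quotient group above has dimension dim C - dim $C_{|k^-}$ - dim $C_{|k^+}$, which equals rank(G restricted to the past) + rank(G restricted to the future) - dim C for any generator matrix G. The sketch below computes this profile; the helper names `gf2_rank` and `state_dim_profile`, and the particular generator matrix and coordinate ordering chosen for the (8, 4, 4) Reed-Muller code, are assumptions made for illustration.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a list of equal-length 0/1 rows (Gaussian elimination)."""
    rows = [list(r) for r in rows]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def state_dim_profile(G):
    """dim S_k for k = 0..n, using rank(past) + rank(future) - dim C."""
    n = len(G[0])
    dim_c = gf2_rank(G)
    return [
        gf2_rank([row[:k] for row in G]) + gf2_rank([row[k:] for row in G]) - dim_c
        for k in range(n + 1)
    ]

# One generator matrix of the (8, 4, 4) Reed-Muller code (assumed ordering).
G = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
profile = state_dim_profile(G)
print(profile)  # state space sizes are 2 ** dim
```

The first and last entries are 0, matching the statement that the initial and final state variables are trivial.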

The local code index set is also the same, e.g., $I_C = [0, n)$. The local codes
$C_k$ specify the configurations $(s_k, a_k, s_{k+1}) \in S_k \times A_k \times S_{k+1}$ that can actually
occur; i.e., which state transitions $(s_k, s_{k+1}) \in S_k \times S_{k+1}$ can occur, and which
(output) symbols $a_k \in A_k$ may be associated with each possible transition. In
other words, the local codes specify trellis sections.

Thus every symbol variable $A_k$ is involved in one local code, namely $C_k$, and
every state variable $S_k$ other than $S_0$ or $S_n$ is involved in two local codes, namely
$C_k$ and $C_{k-1}$. Actually $S_0$ and $S_n$ are not involved in the realization, because a
function of a unary variable does not actually depend on that variable. Therefore,
excluding $S_0$ and $S_n$, a conventional state realization is a normal realization.

Figure 1 depicts the normal graph of a conventional state realization of a 
block code of length 6. State variables other than $S_0$ or $S_6$ are represented
by ordinary edges, symbol variables are represented by leaf edges (indicated 
by a special symbol), and local constraints (trellis sections) are represented by 
vertices. 




Figure 1. Normal graph of a conventional state realization (trellis). 

A conventional state realization is evidently cycle-free, so the sum-product 
algorithm is exact and completes in a finite time of the order of the length of 
the code. In this case the sum-product algorithm for APP decoding is known 
as the “forward-backward” or “BCJR” algorithm, while for ML decoding it 
is equivalent to the Viterbi algorithm [AM98,KFL98]. The complexity of the 
algorithm is proportional to the maximum state space size, or more precisely to 
the maximum local constraint size (“branch complexity”). 

In the past decade there has been a great deal of research into minimizing the 
trellis complexity of block codes, which for linear codes is essentially a matter of 
finding the best ordering of the symbol index set $I_A$. This research is summarized
in [Var98]. For many of the best-known families of codes, such as Reed-Muller 
codes and the (24, 12, 8) Golay code, the optimal coordinate ordering is known. 

If the trellis of Figure 1 is a minimal realization of a linear code C for a given
ordering of $I_A$, then the dual realization theorem implies that the dual trellis
shown in Figure 2 is a minimal realization of the dual linear code $C^\perp$ for the
same coordinate ordering. (The small circles in Figure 2 represent sign inverters.)
This result not only confirms the well-known fact that dual codes have identical 
minimum state complexity profiles in a given coordinate ordering [For88], but 
also gives an explicit specification for the dual trellis, as found previously by 
Mittelholzer [Mit95]. 




Figure 2. Dual trellis, realizing dual code. 



2.2 Tail-Biting Realizations 

A tail-biting realization of a code C is obtained by identifying the symbol index
set $I_A$ with a circular time axis, which may be represented by the cyclic group
$\mathbb{Z}_n$ of integers modulo n. Again, the mapping from $I_A$ to $\mathbb{Z}_n$ is a design choice.

The state index set $I_S$ and the local code index set $I_C$ are also taken as
$\mathbb{Z}_n$. Again, the local codes $C_k$ specify trellis sections; i.e., the configurations
$(s_k, a_k, s_{k+1}) \in S_k \times A_k \times S_{k+1}$ that can actually occur, where the indices are
evaluated in $\mathbb{Z}_n$ (i.e., modulo n). Thus again every symbol variable $A_k$ is involved
in one local code $C_k$, and every state variable $S_k$ is involved in two local codes,
$C_k$ and $C_{k-1}$. Therefore a tail-biting realization is a normal realization. Figure
3 shows the normal graph of a tail-biting realization of a block code of length 8.




Figure 3. Normal graph of a tail-biting realization (trellis).

Given a tail-biting realization of a group code C, the dual realization theorem
again implies that the dual code $C^\perp$ is realized by the dual tail-biting realization,
which in particular has the same state space sizes.

Again, the state spaces $S_k$ should be chosen to be as small as possible. On a
graph with cycles, no definite minimum for $|S_k|$ can be given in general. However,
there is a useful lower bound based on cut sets.
A cut set of a connected graph is a minimal set of edges such that removal of
that set partitions the graph into two disconnected subgraphs. The symbol variables
are correspondingly partitioned into two disjoint subsets, say “past” and
“future.” The Cut Set Lower Bound [WLK95,CFV99] lower-bounds the product
of the sizes of the state spaces corresponding to the cut set by the minimum size
of the state space in any conventional trellis for the same code that has the same
“past” and “future” symbol subsets.

In a tail-biting trellis, every cut set comprises two edges. For two diametrically
opposite edges, the Cut Set Lower Bound implies that the product of the two
corresponding state space sizes cannot be less than the minimum state space
size $S_{\mathrm{mid}}$ at the midpoint of any conventional trellis for the same code. This in
turn implies that the minimal maximum state space size in any tail-biting trellis
is at least $\sqrt{S_{\mathrm{mid}}}$.

For example, for the (24, 12, 8) Golay code, it is known that $S_{\mathrm{mid}} = 256$,
so the minimal maximum state space size is lower-bounded by $\sqrt{256} = 16$. In fact, there
exists a 16-state tail-biting trellis for this code [CFV99]. Sum-product decoding
is inexact, but on an additive white Gaussian noise (AWGN) channel is within
0.1 dB of ML decoding performance.

In this example and in general, the Cut Set Lower Bound shows that dramatic 
reductions in state complexity are possible in realizations with cycles. On the 
other hand, in cycle-free realizations, every edge is a cut set, and therefore the 
size of every state space is lowerbounded by the size of some state space in some 
conventional trellis. Therefore most research on “codes on graphs” has focused 
on realizations with cycles, even though sum-product decoding is then inexact. 

2.3 Realizations of Reed-Muller Codes 

Reed-Muller (RM) codes are an infinite family of binary linear block codes that 
include many of the best shorter codes. RM codes are known to have rather 
simple conventional state realizations (trellises) [For88,Var98]. 

The Reed-Muller code RM(r, m) of length $2^m$ and minimum distance $d = 2^{m-r}$
is defined as the code generated by the rows of the $2^m \times 2^m$ binary
Hadamard matrix $H_{2^m}$ that have weight d or greater. In other words, the codewords
are given by $a = A H_{2^m}$, where the components of the information vector
A corresponding to rows of weight less than d are zero, and the remaining components
are free.
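
This definition can be checked directly on small cases. The sketch below assumes that the binary Hadamard matrix is the m-fold Kronecker power of the 2 x 2 matrix [[1, 0], [1, 1]] over GF(2); that choice of base matrix, and the function names, are assumptions made for illustration rather than details stated in the text above.

```python
from itertools import product

def kron_gf2(A, B):
    """Kronecker product of two 0/1 matrices (entries multiply as AND)."""
    return [[a & b for a in row_a for b in row_b] for row_a in A for row_b in B]

def hadamard_gf2(m):
    """m-fold Kronecker power of [[1, 0], [1, 1]] (assumed base matrix)."""
    H = [[1]]
    for _ in range(m):
        H = kron_gf2(H, [[1, 0], [1, 1]])
    return H

def rm_generator(r, m):
    """Rows of the 2^m x 2^m matrix with weight at least d = 2^(m - r)."""
    d = 2 ** (m - r)
    return [row for row in hadamard_gf2(m) if sum(row) >= d]

def codewords(G):
    """All GF(2) linear combinations of the rows of G."""
    n = len(G[0])
    words = set()
    for coeffs in product((0, 1), repeat=len(G)):
        word = [0] * n
        for c, row in zip(coeffs, G):
            if c:
                word = [x ^ y for x, y in zip(word, row)]
        words.add(tuple(word))
    return words

G = rm_generator(1, 3)          # the (8, 4, 4) code
words = codewords(G)
min_weight = min(sum(w) for w in words if any(w))
print(len(G), len(words), min_weight)
```

Under the stated assumption, exactly four rows of the 8 x 8 matrix have weight 4 or more, which corresponds to the four free components of the information vector.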

This definition leads to an efficient normal realization for every RM code.
For example, Figure 4 is the graph of a realization of the (8, 4, 4) RM code.
The codeword a is represented by the 8 binary symbol variables at the top. The
information vector A, which is not visible, is represented by the 8 binary state
variables at the bottom, 4 of which are constrained to be zero and 4 of which are
free (indicated by “0” and “F” constraints, respectively). The remainder of the
realization enforces the relation $a = A H_8$ with a “fast Hadamard transform,”
whose basic building block is a 2 x 2 Hadamard transform. All state spaces (edges)
are binary, and all local constraints are either (3, 1, 3) repetition codes or (3, 2, 2)
single-parity-check (SPC) codes, indicated by “=” or “+,” respectively.
Figure 4. Realization of (8, 4, 4) Reed-Muller code.

The dual realization is obtained by interchanging zero and free constraints, 
as well as repetition and SPC constraints. Starting from a graph for an RM(r, m)
code as above, dualization produces a graph which is the left-right mirror image
of the graph for the RM(m - r - 1, m) code. The dual realization theorem then
implies the well-known result that these two codes are duals.

For any particular code, the generic realization above can be reduced by 
propagating the effects of the free and zero constraints upward in the realization. 
For example, a reduced realization of the (8, 4, 4) RM code is shown in Figure 5. 




Figure 5. Reduced realization of (8, 4, 4) RM code.

These realizations are very efficient, with binary state spaces and of the order
of $n \log_2 n$ constraints of the simplest type, namely (3, 1, 3) or (3, 2, 2). However,
in general they have cycles, which degrade decoding performance. For example,
on an AWGN channel, decoding of the (8,4,4) code using the realization of 
Figure 5 is about 0.2 dB inferior to ML decoding, and for larger codes the 
performance loss increases rapidly. 

Cycles may be removed by clustering (“sectionalization”). For example, with
the clusters indicated by dotted lines in Figure 5, the cycle-free realization of the
(8, 4, 4) code shown in Figure 6 is obtained. All state and symbol variables are
now quaternary. The rectangular constraints represent isomorphisms between 
quaternary variables, which require no computation in a sum-product update. 
The two constraints denoted by “(6, 3)” represent (6, 3) binary linear codes. 




Figure 6. Cycle-free realization of (8, 4, 4) code.



This realization is in fact equivalent to the standard 4-state sectionalized 
trellis realization of this code with 2-bit sections. Similar cycle-free realizations 
with a similar binary tree structure can be constructed for all RM codes. Their 
maximum constraint complexity (“branch complexity”) is always equal to that of
the minimal standard trellis realization for the same code. However, they specify
exact ML decoding algorithms that are typically somewhat more efficient than
Viterbi algorithm decoding of the standard trellis, because of their “divide-by-2”
tree structure. In fact, these algorithms are precisely those that were proposed
for ML decoding of RM codes in [For88].

As a final simple example, the reduced realization of the (8, 7, 2) SPC code
is shown in Figure 7, as well as the standard 2-state trellis. Both realizations
involve 6 (3, 2, 2) constraints, so both have the same “space complexity.” However,
because of its binary tree structure, the diameter of the former realization is less,
so its “time complexity” is less ($2 \log_2 n/2$ vs. $n - 2$, in the general case). This
shows how the complexity of even such an apparently irreducible realization as
a 2-state SPC trellis can be reduced by a more general graphical model.




Figure 7. Realizations of (8, 7, 2) code: (a) tree; (b) trellis.



2.4 Low-Density Parity-Check Codes 

A binary linear block code C may be specified by an n x r parity-check matrix
$H^T$, as follows: $C = \{a \in \mathbb{F}_2^n \mid a H^T = 0\}$. The normal graph of such a
parity-check realization is bipartite, and is shown in Figure 8(a). It is essentially
equivalent to a Tanner graph [Tan81]. The dual bipartite graph, representing
a dual generator-matrix realization of a code $C^\perp$ with generator matrix $H$, is
shown in Figure 8(b).
Figure 8. Realizations based on (a) parity-check matrix; (b) generator matrix.

Low-density parity-check (LDPC) codes, originally proposed by Gallager
[Gal62], are codes in which the matrix $H^T$ (and therefore the adjacency matrix
in the graph of Figure 8(a)) is sparse. Although such a graph has cycles, their
girth can be made large. Sum-product decoding performance closely approaching
channel capacity has now been demonstrated with long LDPC codes [RSU99].

For very long blocks, a pseudo-random choice of $H^T$ appears to be satisfactory.
For moderate block lengths of the order of 1000 bits, however, it is
expected that algebraic techniques may be helpful in specifying an optimal $H^T$.
There may also be some advantage to using nonbinary “states” (edges).
Abstract. In ref. [3] we introduced a simplified ensemble of serially concatenated
“turbo-like” codes which we called repeat-accumulate, or RA
codes. These codes are very easy to decode using an iterative decoding
algorithm derived from belief propagation on the appropriate Tanner 
graph, yet their performance is scarcely inferior to that of full-fledged 
turbo codes. In this paper, we prove that on the AWGN channel, RA 
codes have the potential for achieving channel capacity. That is, as the 
rate of the RA code approaches zero, the average required bit $E_b/N_0$
for arbitrarily small error probability with maximum-likelihood decoding
approaches log 2, which is the Shannon limit. In view of the extreme
simplicity of RA codes, this result is both surprising and suggestive. 



1 Introduction 

In ref. [3] we introduced a class of turbo-like codes, the repeat and accumu- 
late (RA) codes, which are simple enough to allow a fairly complete theoretical 
analysis, yet powerful enough to perform nearly as well as full-fledged turbo 
codes. The general idea is shown in Figure 1. An information block of length 
k is repeated q times, scrambled by a pseudorandom permutation (interleaver) 
of size qk, and then encoded by a rate-1 accumulator. The accumulator can be
viewed as a truncated rate-1 recursive convolutional encoder with transfer function
$1/(1 + D)$, but we prefer to think of it as a block encoder whose input block
$[x_1, \ldots, x_n]$ and output block $[y_1, \ldots, y_n]$ are related by the formula

$$\begin{aligned}
y_1 &= x_1 \\
y_2 &= x_1 + x_2 \\
y_3 &= x_1 + x_2 + x_3 \\
&\;\;\vdots \\
y_n &= x_1 + x_2 + x_3 + \cdots + x_n.
\end{aligned}$$

In other words, the outputs of the accumulator are the mod-2 partial sums of
the inputs. An RA code as just described is therefore a (qk, k) linear block code,
but because of the unspecified interleaver, there are a large number, (qk)! to be
precise, of RA codes with these parameters.
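
The three encoder stages described above (repeat, permute, accumulate) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `ra_encode`, the choice to repeat the whole block rather than each bit (the two differ only by a fixed permutation), and the use of a seeded pseudorandom interleaver are all assumptions.

```python
import random

def ra_encode(info_bits, q, perm):
    """Encode k information bits into a (qk, k) repeat-accumulate codeword.

    perm is a permutation of range(q * k) playing the role of the interleaver.
    """
    repeated = list(info_bits) * q              # repeat the k-bit block q times
    permuted = [repeated[p] for p in perm]      # interleave
    out, running = [], 0
    for x in permuted:                          # accumulate: mod-2 partial sums
        running ^= x
        out.append(running)
    return out

k, q = 4, 3
rng = random.Random(0)                          # fixed seed, for reproducibility only
perm = list(range(q * k))
rng.shuffle(perm)
codeword = ra_encode([1, 0, 1, 1], q, perm)
print(codeword)
```

Each stage is linear over GF(2), so the encoder as a whole is linear, consistent with the statement that an RA code is a linear block code.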

* This work was supported by NSF grant no. CCR-9804793, and grants from Sony 
and Qualcomm. 
Fig. 1. Encoder for a (qk, k) RA code. The “rep. q” component repeats its k-bit
input block q times; the “P” block represents an arbitrary permutation of
its qk-bit input block; and the “acc.” is an accumulator, whose outputs are the
mod-2 partial sums of its inputs.



For the theorist, the most important thing about RA codes is that their 
combinatorial properties are reasonably well understood [3], [4]. For the practi- 
tioner, the most important thing is the experimentally verified fact [3] that if an 
iterative decoding algorithm derived from belief propagation on the appropriate 
Tanner graph is applied to them, their AWGN channel performance is scarcely 
inferior to that of full-fledged turbo codes. In this paper, we will give a partial 
explanation of the latter property, by using the former property to show that on 
the AWGN channel, RA codes have the potential for achieving channel capacity. 
That is, as the rate of the RA code approaches zero, the average required bit
$E_b/N_0$ for arbitrarily small error probability with maximum-likelihood decoding
approaches log 2, which is the Shannon limit. The remaining problem, of
course, is to explain why the low-complexity but suboptimal iterative decoding 
algorithm performs so well. 

2 An Ensemble AWGN Coding Theorem for RA Codes 

Our main theorem deals not with any particular RA code, but with the average, 
or ensemble performance, as the block length approaches infinity. Therefore we 
begin with a brief description of what we mean by an ensemble of codes. 

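An ensemble of linear codes is a sequence $\mathcal{C}_{n_1}, \mathcal{C}_{n_2}, \ldots$ of sets of linear codes
of a common rate R, where $\mathcal{C}_{n_i}$ is a set of $(n_i, k_i)$ codes with $k_i/n_i = R$. We
assume that the sequence $n_1, n_2, \ldots$ approaches infinity. If C is an (n, k) code
in the ensemble, the weight enumerator of C is the list

$$A_0(C), A_1(C), \ldots, A_n(C),$$

where $A_h(C)$ is the number of words of weight h in C. The average weight
enumerator for the set $\mathcal{C}_n$ is defined as the list

$$\overline{A}_0^{(n)}, \overline{A}_1^{(n)}, \ldots, \overline{A}_n^{(n)},$$

where

$$\overline{A}_h^{(n)} \stackrel{\mathrm{def}}{=} \frac{1}{|\mathcal{C}_n|} \sum_{C \in \mathcal{C}_n} A_h(C) \quad \text{for } h = 0, 1, \ldots, n. \qquad (2.1)$$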
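Also, we define the spectral shape of $\mathcal{C}_n$,

$$r_n(\delta) \stackrel{\mathrm{def}}{=} \frac{1}{n} \log \overline{A}_{\lfloor \delta n \rfloor}^{(n)}, \quad \text{for } 0 < \delta < 1, \qquad (2.2)$$

and the ensemble spectral shape:

$$r(\delta) \stackrel{\mathrm{def}}{=} \lim_{n \to \infty} r_n(\delta) \quad \text{for } 0 < \delta < 1, \qquad (2.3)$$

where we are implicitly assuming that the limit exists.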

As a special case, consider the ensemble of rate R = 1/q RA codes. Here
the sequence $n_1, n_2, n_3, \ldots$ is $q, 2q, 3q, \ldots$, and the set $\mathcal{C}_{qk}$ consists of (qk)! linear
(n, k) block codes, all with n = qk. In [3], [4] it was shown that the spectral shape
for this ensemble of codes is given by the formula

$$r_q(\delta) = \max_{0 \le u \le \min(2\delta,\, 2 - 2\delta)} \left\{ f(u, \delta) + \frac{H(u)}{q} \right\}, \qquad (2.4)$$

where

$$f(u, \delta) \stackrel{\mathrm{def}}{=} -H(u) + (1 - \delta)\, H\!\left(\frac{u}{2(1 - \delta)}\right) + \delta\, H\!\left(\frac{u}{2\delta}\right), \qquad (2.5)$$

and $H(u) = -(u \log u + (1 - u) \log(1 - u))$ is the (natural) entropy function.
Figure 2 shows the function $r_q(\delta)$ for q = 4.
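
The spectral shape can be evaluated numerically by a simple grid search over u. The sketch below assumes that f(u, δ) has the form -H(u) + (1 - δ) H(u / (2(1 - δ))) + δ H(u / (2δ)), which is the form consistent with the Jensen argument used later in Lemma 31; the function names and the grid resolution are illustrative choices.

```python
import math

def entropy(u):
    """Natural-log binary entropy, with H(0) = H(1) = 0."""
    if u <= 0.0 or u >= 1.0:
        return 0.0
    return -(u * math.log(u) + (1.0 - u) * math.log(1.0 - u))

def f(u, delta):
    """Assumed form of f(u, delta) in (2.5), for 0 < delta < 1."""
    return (-entropy(u)
            + (1.0 - delta) * entropy(u / (2.0 * (1.0 - delta)))
            + delta * entropy(u / (2.0 * delta)))

def r_q(delta, q, steps=2000):
    """Grid-search approximation of the maximum in (2.4)."""
    u_max = min(2.0 * delta, 2.0 - 2.0 * delta)
    best = -math.inf
    for i in range(steps + 1):
        u = u_max * i / steps
        best = max(best, f(u, delta) + entropy(u) / q)
    return best

for delta in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(delta, r_q(delta, q=4))
```

At δ = 1/2 the term f vanishes and the maximum is H(1/2)/q = (log 2)/q; away from δ = 1/2 the grid search returns a smaller value.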
In [4], using a bound derived in [2], we proved that if we define, for each
integer $q \ge 2$,

$$c_0(q) \stackrel{\mathrm{def}}{=} \sup_{0 < \delta < 1} \frac{1 - \delta}{\delta} \cdot \frac{1 - e^{-2 r_q(\delta)}}{2} \qquad (2.6)$$

and

$$\gamma_q \stackrel{\mathrm{def}}{=} q\, c_0(q), \qquad (2.7)$$

then if $q \ge 3$, and $E_b/N_0 > \gamma_q$, as $n \to \infty$, the average maximum-likelihood
word error probability for the ensemble of rate 1/q RA codes approaches 0. A
short table of these thresholds, together with the corresponding AWGN Shannon
limit, is given below.

  q      R      γ_q (dB)    Shannon (dB)
  2     1/2      3.384         0.184
  3     1/3      0.792        -0.495
  4     1/4     -0.052        -0.794
  5     1/5     -0.480        -0.963
  6     1/6     -0.734        -1.071
  7     1/7     -0.900        -1.15
  8     1/8     -1.015        -1.210
  ∞      0     (-1.592)       -1.592

For example, the q = 3 line of this table tells us that if $E_b/N_0 > 0.792$ dB, then
as $n \to \infty$, the word error probability for the ensemble of rate 1/3 RA codes
approaches zero.^1 On the other hand, for $E_b/N_0 < -0.495$ dB (the Shannon
limit for binary codes of rate 1/3), as $n \to \infty$, the word error probability for any
sequence of codes of rate 1/3 must approach 1.
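
The thresholds can be approximated numerically from their definition. The sketch below is a coarse grid search under two assumptions: that f(u, δ) = -H(u) + (1 - δ) H(u / (2(1 - δ))) + δ H(u / (2δ)), and that c_0(q) is the supremum over δ of ((1 - δ)/δ) (1 - exp(-2 r_q(δ)))/2. The function names and grid sizes are illustrative, and no claim is made here that the printed values reproduce the table to three decimals.

```python
import math

def entropy(u):
    """Natural-log binary entropy, with H(0) = H(1) = 0."""
    if u <= 0.0 or u >= 1.0:
        return 0.0
    return -(u * math.log(u) + (1.0 - u) * math.log(1.0 - u))

def f(u, delta):
    """Assumed form of f(u, delta) in (2.5)."""
    return (-entropy(u)
            + (1.0 - delta) * entropy(u / (2.0 * (1.0 - delta)))
            + delta * entropy(u / (2.0 * delta)))

def r_q(delta, q, steps):
    """Grid-search approximation of the spectral shape (2.4)."""
    u_max = min(2.0 * delta, 2.0 - 2.0 * delta)
    return max(f(u_max * i / steps, delta) + entropy(u_max * i / steps) / q
               for i in range(steps + 1))

def gamma_q(q, steps=400):
    """Approximate threshold q * c_0(q), with c_0 as assumed in (2.6)."""
    best = 0.0
    for i in range(1, steps):
        delta = i / steps
        value = (1.0 - delta) / delta * (1.0 - math.exp(-2.0 * r_q(delta, q, steps))) / 2.0
        best = max(best, value)
    return q * best

def to_db(x):
    return 10.0 * math.log10(x)

if __name__ == "__main__":
    for q in (3, 4, 5):
        print(q, round(to_db(gamma_q(q)), 3), "dB")
    print("log 2 in dB:", round(to_db(math.log(2)), 3))
```

A grid search can only underestimate a supremum, so a finer grid (larger `steps`) moves the result upward toward the true threshold.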

In the table, and more clearly in Figure 3, we see that as the rate R approaches
zero, the Shannon limit approaches log 2 = -1.592 dB. This is of
course well known, and in fact the value -1.592 dB is usually referred to as the
Shannon limit for the AWGN channel. The interesting thing for us, however, is
that the RA code thresholds also seem to approach -1.592 dB. This empirical
observation is in fact true, and it is the object of this paper to prove the following
theorem.



Theorem 21. We have $\lim_{q \to \infty} \gamma_q = \log 2$, i.e., RA codes achieve the Shannon
limit for the AWGN channel.



We will give a proof of Theorem 21 in Section 3. But here we note that it is
easy to prove that the limit cannot be smaller than log 2:^2

^1 We have included the threshold for q = 2 in the table because it can be shown that
for rate 1/2 RA codes, if $E_b/N_0 > \gamma_2$, the ensemble bit error probability approaches
zero, although the word error probability does not.

^2 Of course, Theorem 22 follows from the fact that log 2 is the Shannon limit for the
channel, but it is interesting to have an “elementary” proof.
Theorem 22. With the threshold $\gamma_q$ defined as above, $\liminf_{q \to \infty} \gamma_q \ge \log 2$.

Proof: First note that in (2.5), we have $f(u, 1/2) = 0$, so that

\[ r_q(1/2) = \max_{0 \le u \le 1} H(u)/q = H(1/2)/q = \frac{\log 2}{q}. \]

Thus by taking $\delta = 1/2$ in the definition (2.6) of $c_0(q)$, we have the lower bound

\[ c_0(q) \ge \frac{1 - (1/2)}{(1/2)} \cdot \frac{1 - e^{-2 r_q(1/2)}}{2} = \frac{1 - e^{-2 \log 2 / q}}{2}. \]

It therefore follows that

\[ \gamma_q = q\, c_0(q) \ge q \cdot \frac{1 - e^{-(2/q) \log 2}}{2}. \tag{2.8} \]

The desired result now follows by observing that the limit, as $q \to \infty$, of the
right side of (2.8) is log 2. □
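
The convergence used in the last step can also be checked numerically. The following is a minimal sketch, not part of the original argument; the helper names `lower_bound` and `to_db` are ours. It tabulates the right side of (2.8) for growing q and converts log 2 to decibels.

```python
import math

def lower_bound(q):
    """Right side of (2.8): q * (1 - exp(-(2/q) * log 2)) / 2."""
    return q * (1.0 - math.exp(-(2.0 / q) * math.log(2.0))) / 2.0

def to_db(x):
    """Convert a positive ratio to decibels."""
    return 10.0 * math.log10(x)

if __name__ == "__main__":
    # The bound increases with q and approaches log 2 from below.
    for q in (3, 8, 100, 10**6):
        print(q, round(lower_bound(q), 6), round(to_db(lower_bound(q)), 3))
    print("log 2 in dB:", round(to_db(math.log(2.0)), 3))
```

The bound is strictly below log 2 for every finite q, since $1 - e^{-t} < t$ for $t > 0$.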
3 Proof of Main Theorem



In this section, we present the proof of Theorem 21. We begin with some preliminary
technical results which concern the spectral shape functions $r_q(\delta)$.

Lemma 31. The function $f(u, \delta)$ defined in (2.5) is nonpositive, i.e., for any
$(u, \delta) \in [0, 1] \times [0, 1]$, $f(u, \delta) \le 0$. Furthermore, $f(u, \delta) = 0$ if and only if $\delta = 1/2$
or $u = 0$.



Proof: Jensen’s inequality, and the fact that H(u) is strictly convex ∩, implies
that for any $\delta \in [0, 1]$, and any two numbers $u_1, u_2 \in [0, 1]$, we have

\[ H(\delta u_1 + (1 - \delta) u_2) \ge \delta H(u_1) + (1 - \delta) H(u_2), \]

with equality if and only if $u_1 = u_2$ and/or $\delta = 0$ or 1. Letting $u_1 = u/(2\delta)$ and
$u_2 = u/(2(1 - \delta))$, we obtain the desired result. □



Corollary 32. We have, for each $q \ge 2$ and $\delta \in [0, 1]$,

\[ r_q(\delta) \le \frac{\log 2}{q}, \]

with equality if and only if $\delta = 1/2$.

Proof: Let $u(q, \delta)$ denote the optimizing value of u in the computation of $r_q(\delta)$,
i.e.,

\[ u(q, \delta) = \arg\max_{0 \le u \le \min(2\delta,\, 2 - 2\delta)} \left\{ f(u, \delta) + \frac{H(u)}{q} \right\}. \tag{3.1} \]

Then by the definition of $r_q(\delta)$, we have

\[ r_q(\delta) = f(u(q, \delta), \delta) + H(u(q, \delta))/q. \]

But since by Lemma 31, $f(u, \delta) \le 0$, we have

\[ r_q(\delta) \le H(u(q, \delta))/q \le \frac{\log 2}{q}. \tag{3.2} \]

For equality to hold in (3.2), we must have, for $u = u(q, \delta)$, both $f(u, \delta) = 0$ and
$u = 1/2$. But by Lemma 31, if $f(u, \delta) = 0$, then either $u = 0$ or $\delta = 1/2$. Thus
equality can hold only when $\delta = 1/2$, as asserted. □
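
Lemma 31 lends itself to a quick numerical sanity check. The sketch below assumes that (2.5) reads $f(u,\delta) = -H(u) + (1-\delta) H\big(u/(2(1-\delta))\big) + \delta H\big(u/(2\delta)\big)$ with H the binary entropy in nats; this form is our reconstruction (it is the one consistent with the substitution $u_1 = u/(2\delta)$, $u_2 = u/(2(1-\delta))$ in the proof), and the function names are ours.

```python
import math

def H(x):
    """Binary entropy in nats, with the convention 0 * log 0 = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def f(u, d):
    """Assumed form of (2.5), for 0 < d < 1 and 0 <= u <= min(2d, 2 - 2d)."""
    return -H(u) + (1.0 - d) * H(u / (2.0 * (1.0 - d))) + d * H(u / (2.0 * d))

def max_f_on_grid(d, steps=200):
    """Largest value of f(., d) on a uniform grid of admissible u."""
    umax = min(2.0 * d, 2.0 - 2.0 * d)
    return max(f(umax * k / steps, d) for k in range(steps + 1))

if __name__ == "__main__":
    # The maximum is 0 (attained at u = 0); for d = 1/2 every u gives 0.
    for d in (0.1, 0.3, 0.5, 0.8):
        print(d, max_f_on_grid(d), f(0.4 * min(2 * d, 2 - 2 * d), d))
```

A grid check of this kind cannot replace the Jensen argument, but it is a cheap guard against sign or normalization mistakes when transcribing (2.5).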

Lemma 33. If $u(q, \delta)$ is defined as in (3.1), then

\[ u(q, \delta) \cdot (1 - u(q, \delta))^{q-1} \le \left( 2\sqrt{\delta(1 - \delta)} \right)^{q}. \tag{3.3} \]

Proof: It is easy to check that the “max” in (2.4) does not occur at either
endpoint (at $u = 0$ the function is zero and its derivative is infinite, and at
$u = \min(2\delta, 2(1 - \delta))$, the function is negative), so at the point $u = u(q, \delta)$, the
partial derivative of the function in braces on the right side of (2.4), with respect
to u, must be zero. This condition works out to be

\[ -(1 - 1/q) \log(1 - u) - (1/q) \log u + (1/2) \log(2 - 2\delta - u) + (1/2) \log(2\delta - u) = 0, \]

or in exponential form,

\[ \frac{\sqrt{(2 - 2\delta - u)(2\delta - u)}}{(1 - u)^{1 - 1/q}\, u^{1/q}} = 1. \tag{3.4} \]

But since $0 < u < \min(2\delta, 2(1 - \delta))$, the numerator on the left side of (3.4) is
$< 2\sqrt{\delta(1 - \delta)}$. Therefore at the maximizing point $u = u(q, \delta)$, we must have

\[ \frac{2\sqrt{\delta(1 - \delta)}}{(1 - u)^{1 - 1/q}\, u^{1/q}} \ge 1. \tag{3.5} \]

Rearranging this, we get (3.3). □



Corollary 34. If $(\delta_q)$ is a sequence of real numbers in the range [0, 1], and
$\lim_{q \to \infty} \delta_q = \delta^* \ne 1/2$, then $\lim_{q \to \infty} u(q, \delta_q) = 0$.

Proof: If $\delta^* \ne 1/2$, then $2\sqrt{\delta^*(1 - \delta^*)} < 1$, and so the right hand side of (3.3)
approaches zero as $q \to \infty$. Thus by (3.3), $u(q, \delta_q)$ must approach either zero or
one. But since $u \le \min(2\delta^*, 2 - 2\delta^*) < 1$, the only possibility is $u(q, \delta_q) \to 0$. □



Corollary 35. There exists a $q_0 \ge 2$ and $\delta_0 > 0$ such that if $q \ge q_0$ and $\delta \le \delta_0$,
then $u(q, \delta) \le \delta^2$.



Proof: Let $\delta < 1/2$. Then $u \le \min(2\delta, 2 - 2\delta) = 2\delta$, so that the left side of (3.3)
is lower bounded by the quantity $u(1 - 2\delta)^{q}$. Thus from (3.3), we obtain

\[ u(q, \delta) \le \left( \frac{2\sqrt{\delta(1 - \delta)}}{1 - 2\delta} \right)^{q}. \tag{3.6} \]

If the quantity in parentheses in (3.6) is less than one, which is true for $\delta < 0.14$,
then the right hand side is decreasing in q. For example, if $q \ge 6$, we have

\[ u(q, \delta) \le \left( \frac{2\sqrt{\delta(1 - \delta)}}{1 - 2\delta} \right)^{6}, \tag{3.7} \]

which, being of order $\delta^3$, will plainly be $\le \delta^2$ for small enough $\delta$. □
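
The two numerical claims in this proof are easy to confirm. The following sketch (function names ours) evaluates the quantity in parentheses in (3.6). Solving $2\sqrt{\delta(1-\delta)} = 1 - 2\delta$ gives $8\delta^2 - 8\delta + 1 = 0$, so the quantity is below one exactly for $\delta < (2 - \sqrt{2})/4 \approx 0.146$, which covers the stated range $\delta < 0.14$.

```python
import math

def base(d):
    """Quantity in parentheses in (3.6): 2*sqrt(d(1-d)) / (1 - 2d), for 0 <= d < 1/2."""
    return 2.0 * math.sqrt(d * (1.0 - d)) / (1.0 - 2.0 * d)

def bound(q, d):
    """Right side of (3.6); with q = 6 this is the right side of (3.7)."""
    return base(d) ** q

# Value of d at which the base equals one (root of 8 d^2 - 8 d + 1 below 1/2).
CROSSOVER = (2.0 - math.sqrt(2.0)) / 4.0

if __name__ == "__main__":
    print("crossover:", CROSSOVER, "base there:", base(CROSSOVER))
    for d in (0.001, 0.005, 0.01):
        # For these small d the q = 6 bound is already below d^2.
        print(d, bound(6, d), d ** 2)
```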

In the definition of $c_0(q)$ in (2.6), let $\delta(q)$ be the optimizing value of $\delta$, i.e.,

\[ \delta(q) = \arg\sup_{0 < \delta < 1} \frac{1 - \delta}{\delta} \cdot \frac{1 - e^{-2 r_q(\delta)}}{2}. \tag{3.8} \]



The following proposition is the key to the proof of Theorem 21. 
Proposition 31.

\[ \lim_{q \to \infty} \delta(q) = 1/2. \]

Proof: We will show that the set $\{\delta(q)\}$ can have no accumulation points in [0, 1]
except 1/2. We begin by proving an inequality, eq. (3.10), below. By definition,

\[ \gamma_q = q\, c_0(q) = q \cdot \frac{1 - \delta(q)}{\delta(q)} \cdot \frac{1 - e^{-2 r_q(\delta(q))}}{2}. \]

But since we have the elementary inequality

\[ (1 - e^{-2x})/2 \le x, \tag{3.9} \]

it follows that

\[ \gamma_q \le q \cdot \frac{1 - \delta(q)}{\delta(q)} \cdot r_q(\delta(q)). \]

But as noted in (3.2), $q\, r_q(\delta) \le H(u(q, \delta))$, so that

\[ \gamma_q \le \frac{1 - \delta(q)}{\delta(q)} \cdot H(u(q, \delta(q))). \tag{3.10} \]

Now we assume that there is a subsequence of q’s for which the $\delta(q)$’s approach
a limit $\delta^* \ne 1/2$. There are two cases to consider, $\delta^* \ne 0$, and $\delta^* = 0$.
In both cases, we have from Corollary 34 that $u(q, \delta(q)) \to 0$. Thus if $\delta^* \ne 0$, it
follows from (3.10) that with q restricted to the given subsequence,

\[ \lim_{q \to \infty} \gamma_q \le \frac{1 - \delta^*}{\delta^*} \cdot H(0) = 0, \]

which contradicts Theorem 22.

On the other hand, if $\delta^* = 0$, we have from Corollary 35 that for q large enough
$u(q, \delta(q)) \le \delta(q)^2$. Thus from (3.10) again, for large enough q, we have

\[ \gamma_q \le (1 - \delta(q)) \cdot \frac{H(\delta(q)^2)}{\delta(q)}. \]

But since $H(x^2)/x \to 0$ as $x \to 0$, with q restricted to the given subsequence,

\[ \lim_{q \to \infty} \gamma_q \le \lim_{q \to \infty} \frac{H(\delta(q)^2)}{\delta(q)} = 0, \]

again contradicting Theorem 22.

We have therefore shown that the only possible accumulation point of the
set $\{\delta(q)\}$ is 1/2, which proves that the limit of the $\delta(q)$’s exists and equals 1/2.
□

We can now prove the main theorem: 

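Proof of Theorem 21. We have, for every q,

\[ \gamma_q = q\, c_0(q) = q \cdot \frac{1 - \delta(q)}{\delta(q)} \cdot \frac{1 - e^{-2 r_q(\delta(q))}}{2}. \]

Applying the elementary inequality (3.9), we obtain

\[ \gamma_q \le q \cdot \frac{1 - \delta(q)}{\delta(q)} \cdot r_q(\delta(q)). \]

Now from Corollary 32, we know that $r_q(\delta) \le (\log 2)/q$, so that we have

\[ \gamma_q \le \frac{1 - \delta(q)}{\delta(q)} \cdot \log 2. \]

But by Proposition 31, $\lim_{q \to \infty} \delta(q) = 1/2$, so

\[ \limsup_{q \to \infty} \gamma_q \le \log 2. \]

But we saw in Theorem 22 that $\liminf_{q \to \infty} \gamma_q \ge \log 2$. Therefore

\[ \lim_{q \to \infty} \gamma_q = \log 2. \]

□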
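
The quantities in this proof can be estimated by a plain grid search. The sketch below is illustrative only: it assumes the form $f(u,\delta) = -H(u) + (1-\delta)H\big(u/(2(1-\delta))\big) + \delta H\big(u/(2\delta)\big)$ for (2.5) (our reconstruction), takes $r_q(\delta)$ as the maximum in (3.1) and $c_0(q)$ as the supremum in (3.8), and uses names of our own choosing. Because both optimizations are restricted to a finite grid, `gamma(q)` is an estimate from below, not an exact threshold.

```python
import math

def H(x):
    """Binary entropy in nats, with 0 * log 0 = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def f(u, d):
    """Assumed form of (2.5)."""
    return -H(u) + (1.0 - d) * H(u / (2.0 * (1.0 - d))) + d * H(u / (2.0 * d))

def r(q, d, steps=200):
    """Grid estimate of r_q(d): max of f(u, d) + H(u)/q over 0 <= u <= min(2d, 2-2d)."""
    umax = min(2.0 * d, 2.0 - 2.0 * d)
    return max(f(umax * k / steps, d) + H(umax * k / steps) / q
               for k in range(steps + 1))

def gamma(q, steps=200):
    """Grid estimate of gamma_q = q * c_0(q), searching d = k/steps for 0 < k < steps."""
    best = 0.0
    for k in range(1, steps):
        d = k / steps
        c = (1.0 - d) / d * (1.0 - math.exp(-2.0 * r(q, d, steps))) / 2.0
        best = max(best, c)
    return q * best

def to_db(x):
    return 10.0 * math.log10(x)

if __name__ == "__main__":
    for q in (3, 8, 32):
        print(q, round(to_db(gamma(q)), 3), "dB")
    print("Shannon limit:", round(to_db(math.log(2.0)), 3), "dB")
```

With an even number of grid steps the point $\delta = 1/2$, $u = 1/2$ lies on the grid, so the estimate is never below the bound (2.8) from Theorem 22.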
Abstract. Gröbner basis theory reduces questions about systems of
polynomial equations to the combinatorial study of monomial ideals,
or staircases. This article gives an elementary introduction to current
research in this area. After reviewing the bivariate case, a new correspondence
is established between planar graphs and minimal resolutions
of monomial ideals in three variables. A brief guide is given to the literature
on complexity issues and monomial ideals in four or more variables.



1 Introduction 

A monomial ideal M is an ideal generated by monomials $x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$ in a
polynomial ring $K[x_1, x_2, \ldots, x_n]$. Monomial ideals are ubiquitous in the study
of Gröbner bases. For instance, if $I = (x^4 + y^4 - 1,\; x^7 + y^7 - 2)$ then its initial
ideal with respect to the total degree term order equals $M = (x^4, x^3y^4, xy^7, y^{10})$.
The ideal I has 28 distinct complex roots, corresponding to the 28 monomials
$x^i y^j$ not in M, that is, to the 28 lattice points under the staircase depicting M:



Fig. 1. The monomial ideal $M = (x^4, x^3y^4, xy^7, y^{10})$, with its generators (white
circles), standard monomials (black dots), and irreducible components (shaded
circles).

At any stage in Buchberger’s algorithm for computing Gröbner bases, one
considers the S-pairs among the current polynomials and removes those which
are redundant [7]. The minimal S-pairs define a graph $G_M$ on the generators of
any monomial ideal M. Our aim is to study this graph, and to give an elementary
introduction to recent work on computing the following objects associated to M:
i. the Hilbert series of M, i.e. the formal sum of all monomials not in M;

ii. the minimal resolution of M by free modules over the polynomial ring; 

iii. the primary decomposition and the irreducible decomposition of M. 

For a first introduction to these problems see [10, §9.2] and [11, Exercises
3.8 and 17.11]. Their importance for Buchberger’s Algorithm was emphasized
by Möller and Mora in [16]. Research papers describing effective algorithms are
[4] and [6] for problem (i), [9] and [14] for problem (ii), and [8] for problem (iii).

Our point of view differs from these sources. We regard (i), (ii), (iii) as the
same problem and we develop combinatorial tools for presenting structured solutions.
It is our belief that this approach will ultimately lead to faster algorithms.
This paper is organized as follows. The easy solution for n = 2 variables will be
reviewed in Section 2. In Sections 3 and 4 we present the generalization to the
case of three variables. We shall see that the S-pair graphs $G_M$ for n = 3 are
precisely the planar graphs. Section 5 summarizes what is known for $n \ge 4$.



2 Two Variables 

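Let us begin by answering the three computational questions for our example.
1. The Hilbert series of the monomial ideal $M = (x^4, x^3y^4, xy^7, y^{10})$ equals

\[ \frac{1 - x^4 - x^3y^4 - xy^7 - y^{10} + x^4y^4 + x^3y^7 + xy^{10}}{(1 - x)(1 - y)}. \tag{2.1} \]

This expression equals the sum of the 28 standard monomials in Figure 1.

2. The minimal free resolution of M is the following exact sequence of modules,

\[ 0 \longrightarrow K[x, y]^3 \xrightarrow{\;\partial_1\;} K[x, y]^4 \xrightarrow{\;\partial_0\;} M \longrightarrow 0, \tag{2.2} \]

where, in matrix notation,

\[ \partial_1 = \begin{pmatrix} y^4 & 0 & 0 \\ -x & y^3 & 0 \\ 0 & -x^2 & y^3 \\ 0 & 0 & -x \end{pmatrix} \quad\text{and}\quad \partial_0 = \begin{pmatrix} x^4 & x^3y^4 & xy^7 & y^{10} \end{pmatrix}. \]

3. The ideal M is (x, y)-primary and its irreducible decomposition equals

\[ M = (x^4, y^4) \cap (x^3, y^7) \cap (x, y^{10}) \quad \text{(see Figure 1)}. \tag{2.3} \]

It is not difficult to extend this to an arbitrary monomial ideal in two variables,
$M = (x^{a_1}y^{b_1}, \ldots, x^{a_r}y^{b_r})$, where $a_1 > \cdots > a_r$ and $b_1 < \cdots < b_r$.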



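Proposition 1. The following holds for any monomial ideal M in two variables:
1. The Hilbert series of M equals

\[ \sum \{ x^i y^j : x^i y^j \notin M \} = \frac{1 - \sum_{i=1}^{r} x^{a_i} y^{b_i} + \sum_{j=1}^{r-1} x^{a_j} y^{b_{j+1}}}{(1 - x)(1 - y)}. \tag{2.4} \]

2. The minimal free resolution of M equals

\[ 0 \longrightarrow K[x, y]^{r-1} \xrightarrow{\;\partial_1\;} K[x, y]^{r} \xrightarrow{\;\partial_0\;} M \longrightarrow 0, \tag{2.5} \]

where $\partial_0$ is the canonical map and $\partial_1$ is given on standard basis vectors by
$\partial_1(e_i) = y^{b_{i+1} - b_i} \cdot e_i - x^{a_i - a_{i+1}} \cdot e_{i+1}$.

3. The irreducible decomposition of M equals

\[ M = (y^{b_1}) \cap (x^{a_1}, y^{b_2}) \cap (x^{a_2}, y^{b_3}) \cap \cdots \cap (x^{a_{r-1}}, y^{b_r}) \cap (x^{a_r}), \tag{2.6} \]

where the first or last component are to be removed if $b_1 = 0$ or $a_r = 0$.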

This implies that for monomials in two variables only consecutive S-pairs matter.

Corollary 1. When applying Buchberger’s Algorithm to r polynomials in two
variables, it suffices to form the r - 1 consecutive S-pairs, instead of all $\binom{r}{2}$ pairs.
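
The two-variable formulas are simple enough to verify by brute force. The sketch below is our own illustration (all function names are ours) for the ideal with exponent pairs (4,0), (3,4), (1,7), (0,10), i.e. generators $x^4, x^3y^4, xy^7, y^{10}$. It counts the standard monomials, reads off the components of (2.6), and confirms that multiplying the sum of standard monomials by $(1-x)(1-y)$ gives the numerator of (2.4).

```python
from collections import defaultdict

def standard_monomials(gens, bound):
    """Exponent pairs (i, j), 0 <= i, j < bound, divisible by no generator."""
    return [(i, j) for i in range(bound) for j in range(bound)
            if not any(i >= a and j >= b for a, b in gens)]

def numerator(gens):
    """Numerator of (2.4) as a dict {exponent pair: coefficient}."""
    poly = {(0, 0): 1}
    for a, b in gens:
        poly[(a, b)] = -1
    for (a, _), (_, b_next) in zip(gens, gens[1:]):
        poly[(a, b_next)] = 1          # lcm of two consecutive generators
    return poly

def irreducible_components(gens):
    """Pairs (a_i, b_{i+1}) standing for (x^{a_i}, y^{b_{i+1}}); artinian case of (2.6)."""
    return [(a, b_next) for (a, _), (_, b_next) in zip(gens, gens[1:])]

def clear_denominator(monomials):
    """Multiply the sum of the given monomials by (1 - x)(1 - y)."""
    poly = defaultdict(int)
    for i, j in monomials:
        for di, dj, sign in ((0, 0, 1), (1, 0, -1), (0, 1, -1), (1, 1, 1)):
            poly[(i + di, j + dj)] += sign
    return {e: c for e, c in poly.items() if c != 0}

# Generators sorted so that a_1 > ... > a_r and b_1 < ... < b_r.
M = [(4, 0), (3, 4), (1, 7), (0, 10)]
std = standard_monomials(M, 11)

if __name__ == "__main__":
    print(len(std))                        # number of standard monomials
    print(irreducible_components(M))
    print(clear_denominator(std) == numerator(M))
```

The brute-force comparison only makes sense for artinian ideals, where the set of standard monomials is finite; the search bound 11 is chosen to exceed the largest exponent.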



3 Three Variables: The Generic Case 



This section gives a combinatorial introduction to the results in the article [3].
The Hilbert series of any monomial ideal $M = (m_1, m_2, \ldots, m_r)$ in the polynomial
ring $K[x_1, \ldots, x_n]$ can be written using Inclusion-Exclusion as follows.
For $I \subseteq \{1, 2, \ldots, r\}$ let $m_I$ denote the least common multiple of $\{m_i : i \in I\}$.

Proposition 2. The Hilbert series of M equals

\[ \sum \{ x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n} \notin M \} = \frac{1}{\prod_{i=1}^{n} (1 - x_i)} \cdot \sum_{I \subseteq \{1, \ldots, r\}} (-1)^{|I|} \cdot m_I. \tag{3.1} \]




Unfortunately this formula is useless for our problem (i), because the number of
summands is an exponential function in r. But almost all summands cancel.

In this section we show that for n = 3 the true number of summands in the
numerator of the Hilbert series of M is a linear function in r. We write x, y, z
for the variables. To simplify our discussion, we first assume that $r \ge 4$ and M
is generic in the following sense: if $m_i = x^{a_i} y^{b_i} z^{c_i}$ and $m_j = x^{a_j} y^{b_j} z^{c_j}$ are
minimal generators of M neither of which is a power of a single variable, then
$a_i \ne a_j$, $b_i \ne b_j$, $c_i \ne c_j$. In Section 4 we shall remove this hypothesis.

We define a graph $G_M$ with vertices 1, 2, ..., r by declaring {i < j} an edge
whenever $m_{ij} = \mathrm{lcm}(m_i, m_j)$ is not a multiple of $m_k$ for any $k \in \{1, \ldots, r\} \setminus \{i, j\}$.
Let $T_M$ be the set of triples {i < j < k} which form a triangle {i, j}, {i, k}, {j, k}
in $G_M$, and which do not consist entirely of powers $x^{a_i}, y^{b_j}, z^{c_k}$ of the variables.
Consider the free module $K[x, y, z]^{T_M}$ with basis $\{ e_{ijk} : \{i, j, k\} \in T_M \}$, the free
module $K[x, y, z]^{G_M}$ with basis $\{ e_{ij} : \{i, j\} \in G_M \}$, and the linear maps

\[ \partial_1 : K[x, y, z]^{G_M} \to K[x, y, z]^{r}, \qquad e_{ij} \mapsto \frac{m_{ij}}{m_j} \cdot e_j - \frac{m_{ij}}{m_i} \cdot e_i, \]
\[ \partial_2 : K[x, y, z]^{T_M} \to K[x, y, z]^{G_M}, \qquad e_{ijk} \mapsto \frac{m_{ijk}}{m_{jk}} \cdot e_{jk} - \frac{m_{ijk}}{m_{ik}} \cdot e_{ik} + \frac{m_{ijk}}{m_{ij}} \cdot e_{ij}. \tag{3.2} \]
Theorem 1. Let M be a generic monomial ideal in K[x, y, z].

1. The Hilbert series of M equals

\[ \frac{1 - \sum_{i=1}^{r} m_i + \sum_{\{i,j\} \in G_M} m_{ij} - \sum_{\{i,j,k\} \in T_M} m_{ijk}}{(1 - x)(1 - y)(1 - z)}. \tag{3.3} \]

2. The minimal free resolution of M equals

\[ 0 \longrightarrow K[x, y, z]^{T_M} \xrightarrow{\;\partial_2\;} K[x, y, z]^{G_M} \xrightarrow{\;\partial_1\;} K[x, y, z]^{r} \longrightarrow M \longrightarrow 0. \tag{3.4} \]

3. If M is artinian then the irreducible decomposition equals

\[ M = \bigcap_{\{i,j,k\} \in T_M} \left( x^{\deg_x(m_{ijk})},\; y^{\deg_y(m_{ijk})},\; z^{\deg_z(m_{ijk})} \right). \tag{3.5} \]

Here M is called artinian if some power of each variable is in M, or equivalently,
if the number of standard monomials is finite. The non-artinian case can be
reduced to the artinian case by considering $M + (x^m, y^m, z^m)$ for $m \gg 0$.

We illustrate Theorem 1 for the generic artinian monomial ideal



1 2 3 4 5 6 7 8 9 10 11 12 

(a;^°, z^°, x®y®z, x^y^z"^ ,x*y^z^ , x’^y^z'^, xy^z^, x^y"^z^, x^y'^z’^ , x^yz^ ,x‘^y‘^z^) 

= (x^°,y^°,z) n (a;®,i/^°, z^) fl (x^,y®,z®) D {x^,y^,z’^) fl (a;"^, z”^) fl 

(xi°,y3,z®) n {x^,y‘^,z^) fl {x^,y'^,z^) fl (x^, y®, z®) n y, z^°) fl 

(x^,y®,z^°) n (x,y^°,z^°) fl (x^,y^,z^) fl (x®,i/®,z^) fl (x^,y^°,z®) fl 

(x"‘,i/®,z®) n (x^,y‘^,z^) fl (x^°,i/®,z2) fl (x®,y2,z^°) 




Fig. 2. A generic monomial ideal with 12 generators (white circles), 30 minimal
S-pairs (black dots/edges), and 19 irreducible components (shaded
circles/triangular regions).
The right picture shows the graph $G_M$. Its triangles correspond to irreducible
components; for instance, the last component listed above arises from the triangle
$\{3, 11, 12\} \in T_M$. The numerator of the Hilbert series has 62 = 1 + 12 + 30 + 19
summands, one for each vertex, edge and region of the planar graph $G_M$.

We are now ready to establish the connection promised in this paper’s title. 

Theorem 2. The S-pair graph $G_M$ of a generic monomial ideal M in three
variables is planar. If M is artinian then $(G_M, T_M)$ is a triangulation of a triangle.

Proof sketch. It suffices to consider artinian ideals M, since one can throw in
high powers $x^m, y^m, z^m$, compute the S-pair graph, and then obtain $G_M$ by
deleting any edges involving $x^m$, $y^m$, or $z^m$. We may further assume that each
other generator of M is divisible by xyz. The idea is to construct a 3-dimensional
polytope whose edge graph equals $G_M$. Our claim follows because edge graphs of
3-polytopes are planar. For instance, Figure 2 is the edge graph of an icosahedron.

Fix a large real number $t \gg 0$. Let $P_t$ denote the convex hull of the points
$(t^{a_i}, t^{b_i}, t^{c_i}) \in \mathbb{R}^3$ as $(a_i, b_i, c_i)$ runs over the exponents on the minimal generators
$x^{a_i} y^{b_i} z^{c_i}$. Each such point is a vertex of $P_t$, and the edge graph of $P_t$ is
independent of t. We call $P_t$ the hull polytope of M. One shows that every edge
E of $G_M$ is an edge of the hull polytope by producing a linear functional which
is minimized on $P_t$ along E (this does not use genericity of M). Finally, the
genericity of M is used to show that every edge of $P_t$ is obtained in this fashion.

Theorem 2 gives an answer to our questions (i), (ii), (iii) even if M is not
generic, but that answer is typically nonminimal. (For a minimal solution see
Section 4.) The answer is found by deforming the non-generic ideal M to a
“nearby” generic ideal $M_\epsilon$, where $\epsilon$ represents a small positive real. For instance,

\[ M = (x^2, xy, xz, y^2, yz, z^2) \quad \text{deforms to} \quad M_\epsilon = (x^2, x^{1+\epsilon}y, xz^{1-\epsilon}, y^2, y^{1+\epsilon}z, z^2). \]

We can apply Theorem 1 to the generic ideal $M_\epsilon$ and then set $\epsilon = 0$ to get the
Hilbert series, a choice of minimal S-pairs, and the irreducible decomposition of
M. For instance, the irreducible decomposition of the deformation equals

\[ M_\epsilon = (x^2, y, z^{1-\epsilon}) \cap (x^{1+\epsilon}, y^2, z^{1-\epsilon}) \cap (x, y^2, z) \cap (x, y^{1+\epsilon}, z^2). \]

We invite the reader to draw the graph $G_{M_\epsilon}$ and see what happens for $\epsilon \to 0$.

The preceding discussion allows Theorem 2 to provide the following complexity
result for all monomial ideals in K[x, y, z]. Recall that a planar graph $G_M$ on
r vertices has at most 3r - 6 edges and at most 2r - 5 bounded regions.

Corollary 2.

1. The numerator of the Hilbert series of an ideal generated by r monomials in
three variables x, y, z has at most 6r - 10 summands.

2. When applying Buchberger’s Criterion to r polynomials in three variables, it
suffices to form at most 3r - 6 S-pairs instead of all $\binom{r}{2}$ possible S-pairs.
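
The definitions of the S-pair graph and of the triangle set translate directly into code. The following sketch is our own illustration, not the authors' implementation. The test ideal $(x^4, x^3y^2, x^2z, y^4, y^3z^2, z^4)$ is a small generic artinian example chosen by us, and all function names are hypothetical. The script computes the edges and triangles from the definitions, assembles the numerator of (3.3), and compares it with a brute-force computation from the standard monomials.

```python
from collections import defaultdict
from itertools import combinations, product

def lcm(*monomials):
    """Least common multiple of monomials given as exponent tuples."""
    return tuple(max(e) for e in zip(*monomials))

def divides(a, b):
    return all(x <= y for x, y in zip(a, b))

def s_pair_graph(gens):
    """Edges {i < j} such that lcm(m_i, m_j) is a multiple of no other generator."""
    r = len(gens)
    return [(i, j) for i, j in combinations(range(r), 2)
            if not any(divides(gens[k], lcm(gens[i], gens[j]))
                       for k in range(r) if k not in (i, j))]

def is_pure_power(m):
    return sum(1 for e in m if e > 0) == 1

def triangles(gens, edges):
    """Triples whose three pairs are edges, excluding triples of pure powers."""
    edge_set = set(edges)
    return [t for t in combinations(range(len(gens)), 3)
            if all(p in edge_set for p in combinations(t, 2))
            and not all(is_pure_power(gens[i]) for i in t)]

def numerator(gens, edges, tris):
    """Numerator of (3.3): 1 - sum m_i + sum m_ij - sum m_ijk."""
    poly = defaultdict(int)
    poly[(0, 0, 0)] += 1
    for m in gens:
        poly[m] -= 1
    for i, j in edges:
        poly[lcm(gens[i], gens[j])] += 1
    for t in tris:
        poly[lcm(*(gens[i] for i in t))] -= 1
    return {e: c for e, c in poly.items() if c != 0}

def brute_force_numerator(gens, bound):
    """(1-x)(1-y)(1-z) times the sum of all standard monomials (artinian case)."""
    poly = defaultdict(int)
    for e in product(range(bound), repeat=3):
        if not any(divides(m, e) for m in gens):
            for shift in product((0, 1), repeat=3):
                poly[tuple(a + s for a, s in zip(e, shift))] += (-1) ** sum(shift)
    return {e: c for e, c in poly.items() if c != 0}

# Hypothetical test case: x^4, x^3 y^2, x^2 z, y^4, y^3 z^2, z^4 (0-based vertex labels).
gens = [(4, 0, 0), (3, 2, 0), (2, 0, 1), (0, 4, 0), (0, 3, 2), (0, 0, 4)]
edges = s_pair_graph(gens)
tris = triangles(gens, edges)

if __name__ == "__main__":
    print(edges)
    print(tris)
    print(numerator(gens, edges, tris) == brute_force_numerator(gens, 5))
```

For r = 6 generators the bounds of Corollary 2 allow up to 12 S-pairs and 26 summands; the example stays below both because three of its generators lie on the coordinate planes.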
4 Three Variables: The Non-generic Case

In this section we outline a proof for the following solution to (i), (ii), and (iii): 

Theorem 3. Every monomial ideal M in K[x, y, z] has a minimal resolution by 
the bounded regions of a planar graph. That resolution gives irredundant formulas 
for the numerator of the Hilbert series and the irreducible decomposition of M. 

For generic monomial ideals M this was established in Theorem 1. Before
discussing the proof for arbitrary M, we must first explain the meaning of “minimal
resolution by a planar graph” G. Suppose the vertices of G are labeled
by the minimal generators $m_1, \ldots, m_r$ of M. Label each edge {i < j} of G by
$m_{ij} = \mathrm{lcm}(m_i, m_j)$. Now G determines a set T of bounded regions, with each
region $R \in T$ having a certain set of vertices $\{i_1 < \cdots < i_t\}$ on its boundary. We
define the label $m_R = \mathrm{lcm}(m_{i_1}, \ldots, m_{i_t})$. Finally, for each region R and each
edge {i, j} define the sign $\epsilon(R, ij)$ to be 0 unless {i, j} is on the boundary of R,
then 1 if R is on the left as one goes from i to j and -1 if R is on the right.

In analogy with Eq. (3.2), consider the following maps of free modules:

\[ \partial_1 : K[x, y, z]^{G} \to K[x, y, z]^{r}, \qquad e_{ij} \mapsto \frac{m_{ij}}{m_j} \cdot e_j - \frac{m_{ij}}{m_i} \cdot e_i, \]
\[ \partial_2 : K[x, y, z]^{T} \to K[x, y, z]^{G}, \qquad e_R \mapsto \sum_{\text{edges } ij} \epsilon(R, ij) \cdot \frac{m_R}{m_{ij}} \cdot e_{ij}, \tag{4.1} \]

which define a complex $F_G$ of free K[x, y, z]-modules in analogy with Eq. (3.4).
Theorem 3 says that it is always possible to choose a labeled planar graph G so
that $F_G$ is exact and minimal, i.e., $m_{ij} \ne m_R$ for any edge ij of a region R. The
graph G will be far from unique for monomial ideals M which are not generic.

Example. Consider the power of the maximal ideal (x, y, z), that is,

\[ M = (x, y, z)^m = ( x^i y^j z^k : i, j, k \in \mathbb{N},\; i + j + k = m ). \tag{4.2} \]

The staircase of M is depicted in Figure 3(a) for m = 5. The graph in Figure 3(b)
represents a free resolution of M for m = 5. This graph is essentially the edge
graph of the hull polytope $P_M$ of M, as defined in Section 3. In fact, the hull
polytope makes sense for any monomial ideal (in any number of variables) and
gives a nice but nonminimal solution to our problems (i), (ii), and (iii).

The resolution defined by the graph in Figure 3(b) is nonminimal because 
every downward-pointing triangle has the same label as all three of its edges. 
We get a graph G satisfying Theorem 3 by deleting precisely one edge from each 
downward-pointing triangle. When an edge is deleted, the two adjacent regions 
join together, as the one with the larger label “swallows” the one with the smaller 
label. Notice how many distinct graphs G result that minimally resolve M ! 
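
The bookkeeping behind this edge-deletion argument can be sketched in a few lines. The counts below assume that the graph of Figure 3(b) is the triangular grid with m + 1 generators on each side, so that it has $\binom{m}{2}$ downward-pointing triangles, none of which share an edge; the function names are ours.

```python
from math import comb

def hull_graph_counts(m):
    """(vertices, edges, bounded regions) of the triangular grid of side m."""
    return comb(m + 2, 2), 3 * comb(m + 1, 2), m * m

def minimal_graph_counts(m):
    """Counts after deleting one edge from each downward-pointing triangle."""
    v, e, f = hull_graph_counts(m)
    down = comb(m, 2)                  # number of downward-pointing triangles
    return v, e - down, f - down       # each deletion merges two regions

def number_of_choices(m):
    """Each downward triangle independently loses one of its three edges."""
    return 3 ** comb(m, 2)

if __name__ == "__main__":
    for m in (2, 3, 5):
        v, e, f = minimal_graph_counts(m)
        print(m, (v, e, f), "Euler characteristic:", v - e + f,
              "choices:", number_of_choices(m))
```

Deleting an edge lowers the edge count and the region count by one each, so the relation v - e + f = 1 for a connected planar graph with f bounded regions is preserved at every step.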

Fig. 3. Symmetric resolutions of various powers of the maximal ideal (x, y, z).

Since the monomial ideal M is invariant under the symmetric group on x, y, z,
we may ask whether there exists a minimal planar graph resolution which is
invariant. The answer is “yes” if and only if m is congruent to 0 or 1 modulo 3.
Examples of these symmetric choices are shown for various m in Figure 3: (c) is
m = 3; (d) is, without the dotted edges, m = 6; (e) is m = 4; and (f) is m = 7.

The general construction for the case $m \equiv 0 \pmod 3$ is obtained by putting
together $m^2/9$ of the graphs in Figure 3(c) in the pattern of Figure 3(b), erasing
the dotted edges as in Figure 3(d) when everything is put together. In the case
$m \equiv 1 \pmod 3$, one radiates upside-down side-length 2 triangles outward from
the center, cutting them off at the boundary as in Figures 3(e,f). Of course, the
upside-down side-length 2 “triangles” are really hexagonal regions in the graph.

Proof Sketch of Theorem 3. As in the generic case, we may assume that M is
artinian. We can always find some planar graph G' such that $F_{G'}$ is a (possibly
nonminimal) free resolution of M, e.g. by using the hull polytope or by deforming
to a nearby generic monomial ideal. Since the labels on the vertices of G' are
minimal generators, the matrix $\partial_1$ never has ±1 entries. The matrix $\partial_2$ has a ±1
entry if and only if some region has the same label as one of its boundary edges.
One shows that the graph obtained from G' by deleting this edge and joining
the two adjacent regions still defines a free resolution of M. A graph G satisfying
Theorem 3 is obtained when there are no nonminimal edges left to delete.

Once we know that $F_G$ is a minimal free resolution then the irredundant
irreducible decomposition of M can be read off the labels on the regions of G, in
analogy with Eq. (3.5). The Hilbert series of M can be written down in analogy
with Eq. (3.3). None of the monomials in the resulting formula for the numerator
cancel. This very last fact relies on $n \le 3$ and may be false for $n \ge 4$ variables.

We claim that the following converse to Theorem 3 holds. 

Theorem 4. For every connected planar graph G there exists a monomial 
ideal M in 3 variables which is minimally resolved by the bounded regions of G. 

This result is a variant of Steinitz’ Theorem [19, Theorem 4.1] which states
that 3-connected planar graphs are the edge graphs of 3-dimensional convex
polytopes. If G is a planar triangulation then M can be chosen generic, and, in
this case, Theorem 4 follows from Schnyder’s Theorem on order dimension [17,
Theorem 6.2.1, p. 128], as explained in [3, §6]. The general non-triangulated
case is more difficult. It does not immediately follow from its order-theoretic
analogue due to Brightwell and Trotter [17, Theorem 6.3.1]. The complete proof
of Theorem 4 is “under construction” and will be published elsewhere.

In Figure 4 is a non-trivial example illustrating the encoding of a planar graph 
G by a monomial ideal M. Note that the order 8 (dihedral group) symmetry of 
the graph cannot be reproduced in the monomial ideal. The square is realized 
by having an irreducible component which is determined by four surrounding 
generators, two of which have one coordinate in common. Similarly, the hexagons 
have six generators spread around the corresponding irreducible component, with 
each such generator sharing one coordinate with the irreducible component. 
Only the artinian components — those with generators on all three sides — define 
regions in G. After M has been chosen to realize G, it may be that G can be 
altered while still remaining a minimal resolution for M. For instance, edge a 
can have its exterior vertex changed from 1 to 2, or even to 3, making the left 
hexagon into a heptagon or octagon. Independently, edge b can have its vertex 
4 moved to 5, making the right hexagon into a pentagon. What must remain 
constant are the numbers of vertices, edges, and regions. 




Fig. 4. A monomial ideal constructed to have the given graph as minimal resolution.
5 Four and More Variables

The ideas presented in Sections 3 and 4 make perfect sense also for $n \ge 4$
variables. There is the hull polytope $P_M$ in $\mathbb{R}^n$, and there are deformations
to generic monomial ideals $M \to M_\epsilon$. Resolutions by planar graphs can be
generalized to resolutions by cell complexes, as defined by Bayer and Sturmfels
[5]. The familiar duality of planar graphs now becomes Alexander duality for
monomial ideals as defined by Miller [15], and yields an efficient algorithm for
problem (iii). However, it is more difficult to construct cellular resolutions which
are minimal. There seems to be no n-dimensional analogue to Theorem 3. One
obstruction is that the minimal resolution may depend on the characteristic of
the field K. Any general formula for minimal resolutions is expected to involve
homological algebra over the lcm-lattice of Gasharov, Peeva and Welker [12]. See
the recent work of Yuzvinsky [18] for a possible approach.

Cellular resolutions provide efficient formulas for Hilbert series and related 
questions even if they are nonminimal. The following complexity result holds. 

Theorem 5. Fix $n \ge 2$. The numerator of the Hilbert series of an ideal M
generated by r monomials in n variables can have order $r^{\lfloor n/2 \rfloor}$ many terms for
$r \gg 0$. The same upper and lower bound holds for the number of irreducible
components of M and the ranks of the modules in a minimal resolution of M.

This upper bound was derived by Bayer, Peeva and Sturmfels [3, Theorem
6.3] from the Upper Bound Theorem for Convex Polytopes [19, Theorem 8.23].
The Upper Bound Theorem states, roughly speaking, that the maximum number
of faces of an n-dimensional polytope with r vertices has order $r^{\lfloor n/2 \rfloor}$. The
matching lower bound for Theorem 5 was established by Agnarsson [1].

There seems to be no analogue to Theorem 4 in higher dimensions. For instance,
there is a triangulation of a tetrahedron with three interior vertices [3,
§6] which does not support a minimal resolution of a monomial ideal M in
R = K[a, b, c, d]. A detailed study of the graphs $G_M$ for n = 4 appears in recent
work of Agnarsson, Felsner and Trotter [2]. The following example is instructive:

M = {a^b^c^,d^,a%'^c^d,a'^b^c^d^,a^b^c^d'^, 

ab^cU^, afb^c'^d^, a%c^d\ a’^b^cd^, aH^c^d^). 

Its S-pair graph $G_M$ is the complete graph on 12 vertices. The minimal resolution
of M is cellular and looks like $0 \to R^{53} \to R^{108} \to R^{66} \to R^{12} \to M \to 0$. It is a
triangulation of a tetrahedron with 8 interior vertices, which is neighborly in the
sense that any two vertices form an edge. Polytope combinatorics [19, §8] tells
us that such a neighborly triangulation has 53 tetrahedra and 108 triangles.

Let $\phi(n)$ denote the maximum number of generators of a monomial ideal M in n
variables whose S-pair graph $G_M$ equals the complete graph. It is a non-trivial
fact that $\phi(n)$ is actually finite. However, it grows doubly-exponentially in n;
see [17, Theorem 7.2.13]. Note that Corollary 1 implies $\phi(2) = 2$, Corollary 2
implies $\phi(3) = 4$, and $\phi(4) = 12$ is attained by the neighborly monomial ideal
M above. The specific values in Table 1 are derived from the following theorem.
Theorem 6. (Hosten and Morris [13]). The number $\phi(n)$ equals the number of
distinct simplicial complexes T on the set {1, 2, ..., n-1} with the property that
no two faces of T have their union equal to {1, 2, ..., n-1}.

Table 1. The maximum number $\phi(n)$ of generators of a neighborly monomial ideal.

variables = n           2   3   4    5    6       7           8
generators = $\phi(n)$      2   4   12   81   2,646   1,422,564   229,809,982,112
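
The first entries of Table 1 can be reproduced by exhaustive enumeration of the complexes described in Theorem 6. The sketch below reflects our reading of the counting convention: a complex is any family of subsets of {1, ..., n-1} that is closed under taking subsets, the family with no faces at all is included, and the union condition is also applied to a face paired with itself. Under this reading the counts for n = 2, 3, 4 agree with the table; the function name `phi` and the bitmask encoding are ours.

```python
def phi(n):
    """Number of subset-closed families on {1..n-1} with no two members covering the set."""
    k = n - 1
    full = (1 << k) - 1
    subsets = range(1 << k)            # a subset is a k-bit mask
    # below[s]: family mask of the sets obtained from s by dropping one element
    below = [sum(1 << (s & ~(1 << e)) for e in range(k) if s >> e & 1)
             for s in subsets]
    # clash[s]: family mask of the sets t with s | t == full
    clash = [sum(1 << t for t in subsets if s | t == full) for s in subsets]
    count = 0
    for family in range(1 << (1 << k)):    # a family is a mask over all subsets
        ok = True
        for s in subsets:
            if family >> s & 1:
                # s needs all its maximal proper subsets, and no covering partner
                if below[s] & ~family or clash[s] & family:
                    ok = False
                    break
        count += ok
    return count

if __name__ == "__main__":
    print([phi(n) for n in (2, 3, 4, 5)])
```

The search space has $2^{2^{n-1}}$ families, so this direct enumeration is only practical up to n = 5; the larger table entries require the methods of [13].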
Abstract. Let G be a finite group of order n. By Wedderburn’s Theorem,
the complex group algebra CG is isomorphic to an algebra of
block diagonal matrices: $\mathbb{C}G \simeq \bigoplus_{k=1}^{h} \mathbb{C}^{d_k \times d_k}$. Every such isomorphism
D, a so-called discrete Fourier transform of CG, consists of a full set
of pairwise inequivalent irreducible representations $D_k$ of CG. A result
of Morgenstern combined with the well-known Schur relations in representation
theory shows that (under mild conditions) any straight line
program for evaluating a DFT needs at least $\Omega(n \log n)$ operations. Thus
in this model, every $O(n \log n)$ FFT of CG is optimal up to a constant
factor. For the class of supersolvable groups we will discuss a program
that from a pc-presentation of G constructs a DFT $D = \bigoplus D_k$ of CG and
generates an $O(n \log n)$ FFT of CG. The running time to construct D is
essentially proportional to the time to write down all the monomial (!)
twiddle factors $D_k(g_i)$ where the $g_i$ are the generators corresponding to
the pc-presentation. Finally, we sketch some applications.



1 Introduction 

This paper is concerned with fast discrete Fourier transforms. From an engineer- 
ing point of view, there are two types of domains: a signal domain and a spectral 
domain. Algebraically, both domains are finite dimensional vector spaces (over 
the complex numbers, say) and, in addition, they are equipped with a multiplica- 
tion which turns both domains into associative C-algebras. The multiplication in 
the signal domain, called convolution, comes from the multiplication in a finite 
group, whereas the multiplication in the spectral domain is closely related to 
matrix multiplication. Fourier transforms isomorphically link these two domains 
and thus link convolution and matrix multiplication. 

To be more specific, let G be a finite group. The set $\mathbb{C}G := \{a \mid a : G \to \mathbb{C}\}$ of
all C-valued functions (signals) on G becomes a vector space over C by pointwise
addition and scalar multiplication. A natural basis is given by the indicator
functions $G \ni h \mapsto \delta_{gh}$ of the group elements $g \in G$. Identifying each group
element with its indicator function, CG can be viewed as the C-linear span of
G, i.e., the span of all formal sums $\sum_{g \in G} a_g g$ with complex coefficients. The
multiplication in G extends to the so-called convolution in CG:

\[ \Big( \sum_{g \in G} a_g g \Big) \cdot \Big( \sum_{h \in G} b_h h \Big) = \sum_{k \in G} \Big( \sum_{g \in G} a_g b_{g^{-1}k} \Big) k. \]

Marc Fossorier et al. (Eds.): AAECC-13, LNCS 1719, pp. 29-42, 1999. 
In this way, CG becomes a C-algebra, the so-called group algebra of G over
C. For example, if $G = C_n = \langle X \mid X^n = 1 \rangle$ is the cyclic group of order n,
then CG can be identified with the polynomial ring C[X] modulo the ideal
generated by $X^n - 1$. In this case, convolution in CG means ordinary polynomial
multiplication modulo the relation $X^n = 1$. If $\omega$ is a primitive nth root of unity,
then the factorization $X^n - 1 = \prod_{j=0}^{n-1} (X - \omega^j)$ combined with the Chinese
remainder theorem shows that $\mathbb{C}C_n$ is isomorphic to the algebra $\bigoplus_{j=0}^{n-1} \mathbb{C}^{1 \times 1}$ of
n-square diagonal matrices. With respect to natural bases in both spaces this
isomorphism is described by the classical DFT matrix $D = (\omega^{jk})_{0 \le j, k < n}$.
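
For the cyclic group this isomorphism can be observed directly: the DFT matrix turns convolution into pointwise multiplication. The sketch below is our own illustration; it fixes the particular primitive root $\omega = e^{2\pi i/n}$ (any primitive nth root would do), and the function names are ours.

```python
import cmath

def dft_matrix(n):
    """Classical DFT matrix (omega^{jk}) with omega = exp(2*pi*i/n)."""
    omega = cmath.exp(2j * cmath.pi / n)
    return [[omega ** (j * k) for k in range(n)] for j in range(n)]

def apply(matrix, a):
    """Matrix-vector product."""
    return [sum(row[k] * a[k] for k in range(len(a))) for row in matrix]

def convolve(a, b):
    """Multiplication in C[X]/(X^n - 1) on coefficient vectors."""
    n = len(a)
    c = [0j] * n
    for g in range(n):
        for h in range(n):
            c[(g + h) % n] += a[g] * b[h]
    return c

if __name__ == "__main__":
    a = [1, 2, 0, -1, 3, 0.5]
    b = [0, 1, 1, 2, -2, 4]
    D = dft_matrix(len(a))
    left = apply(D, convolve(a, b))
    right = [x * y for x, y in zip(apply(D, a), apply(D, b))]
    print(max(abs(l - r) for l, r in zip(left, right)))   # tiny rounding error
```

Row j of the matrix evaluates a polynomial at $\omega^j$, which is exactly the projection onto the jth factor in the Chinese remainder decomposition.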

Wedderburn’s structure theorem for split semisimple algebras yields the right
generalization of the above situation: according to this theorem, the complex
group algebra CG is isomorphic to an algebra of block diagonal matrices,

\[ D = \bigoplus_{k=1}^{h} D_k : \mathbb{C}G \longrightarrow \bigoplus_{k=1}^{h} \mathbb{C}^{d_k \times d_k}. \]

Here, the number h of blocks equals the number of conjugacy classes of G and the
projections $D_1, \ldots, D_h$ form a complete set of pairwise inequivalent irreducible
representations of CG. Recall that a representation of CG of degree f is an
algebra morphism $F : \mathbb{C}G \to \mathbb{C}^{f \times f}$. It is irreducible iff F is surjective. Two
representations $F_1, F_2$ of degree f are equivalent, $F_1 \sim F_2$, if an invertible matrix
X exists such that for all $a \in \mathbb{C}G$: $F_1(a) = X F_2(a) X^{-1}$. Every isomorphism D is
called a discrete Fourier transform (DFT). If G is non-abelian, there are infinitely
many DFTs of CG. However, according to the Skolem-Noether theorem, if D
and $\Delta$ are DFTs of CG, then there are invertible matrices $X_k$ such that $\Delta(a) =
\bigoplus_k X_k D_k(a) X_k^{-1}$ for all $a \in \mathbb{C}G$. In the sequel, DFT(G) denotes the set of all
DFT matrices of G.
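
A small non-abelian example makes the block structure concrete. The sketch below is our own illustration for the dihedral group of order 6 (isomorphic to $S_3$), with elements written as $r^k s^f$ and the relation $s r s^{-1} = r^{-1}$. It uses the trivial representation, the sign representation, and the two-dimensional representation by a rotation and a reflection, so that h = 3 and $1 + 1 + 4 = 6$. The choice of representatives and all names are assumptions made for the example.

```python
import math
from itertools import product

ELEMENTS = [(k, f) for k in range(3) for f in range(2)]   # (k, f) stands for r^k s^f

def mul(g, h):
    """Group law: r^k1 s^f1 * r^k2 s^f2, using s r = r^{-1} s."""
    (k1, f1), (k2, f2) = g, h
    return ((k1 + (-1) ** f1 * k2) % 3, f1 ^ f2)

def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def rho(g):
    """Two-dimensional representation: r -> rotation by 120 degrees, s -> diag(1, -1)."""
    k, f = g
    c, s = math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3)
    rot = [[c, -s], [s, c]]
    return matmul(rot, [[1.0, 0.0], [0.0, -1.0]]) if f else rot

REPS = [lambda g: [[1.0]], lambda g: [[(-1.0) ** g[1]]], rho]

def dft(a):
    """Blocks D_k(a) = sum_g a[g] * D_k(g), for a given as {group element: coefficient}."""
    blocks = []
    for rep in REPS:
        d = len(rep(ELEMENTS[0]))
        block = [[0.0] * d for _ in range(d)]
        for g in ELEMENTS:
            m = rep(g)
            for i, j in product(range(d), repeat=2):
                block[i][j] += a[g] * m[i][j]
        blocks.append(block)
    return blocks

def convolve(a, b):
    """Multiplication in the group algebra."""
    c = {g: 0.0 for g in ELEMENTS}
    for g in ELEMENTS:
        for h in ELEMENTS:
            c[mul(g, h)] += a[g] * b[h]
    return c

if __name__ == "__main__":
    a = {g: float(i + 1) for i, g in enumerate(ELEMENTS)}
    b = {g: float((-1) ** i * (i + 2)) for i, g in enumerate(ELEMENTS)}
    for lhs, A, B in zip(dft(convolve(a, b)), dft(a), dft(b)):
        rhs = matmul(A, B)
        print(max(abs(x - y) for rx, ry in zip(lhs, rhs) for x, y in zip(rx, ry)))
```

Conjugating `rho` by any invertible 2x2 matrix gives another DFT of the same group algebra, which is the freedom in choosing the matrices $X_k$ described above.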

From an algebraic point of view, performing a DFT, $\mathbb{C}G \ni a \mapsto D(a)$,
amounts to evaluating a full set of pairwise inequivalent irreducible representations.
In matrix terminology this amounts to multiplying the corresponding DFT
matrix D by an input vector $a := (a_g)_{g \in G}$. The linear complexity L(D) of the
DFT matrix is the minimum number of additions, subtractions, and scalar multiplications^
to compute the matrix-vector product $D \cdot a$ for arbitrary $a \in \mathbb{C}^{|G|}$.
If the program constants are restricted to be of absolute value $\le 2$, then the
corresponding minimum number $L_2(D)$ is called the 2-linear complexity of D.
The linear complexity L(G) of the finite group G is defined by

\[ L(G) := \min \{ L(D) \mid D \in \mathrm{DFT}(G) \}. \]

Similarly, one defines $L_2(G)$. Trivially, $|G| - 1 \le L(G) \le 2|G| \cdot (|G| - 1)$, and
$L(G) \le L_2(G)$. A theorem by Morgenstern [7] combined with the classical Schur
relations yields [2]: $L_2(G) \ge \frac{1}{4} |G| \log |G|$. Thus performing a DFT with only
$O(|G| \log |G|)$ additions, subtractions, and scalar multiplications by small program
constants is almost optimal in the $L_2$-model. This justifies the name Fast
Fourier Transform. (For more details on lower complexity bounds, see Chapter
13 of [4].) As an example, the Cooley-Tukey FFT of CG, G a cyclic 2-group,
uses only roots of unity as program constants and is thus essentially 2-optimal
in the $L_2$-model.

^ In the classical FFT algorithms these program constants are the so-called twiddle
factors.

Prior to performing a DFT one has to solve a fundamental problem in
representation theory: up to equivalence, all irreducible representations of $\mathbb{C}G$ have
to be generated. In general not an easy task! Even worse: as we are interested
in a fast evaluation of $D = \bigoplus_k D_k$ we should choose the representatives $D_k$ in the
equivalence classes very carefully. In other words, we have to choose the right
$X_k$ above.

At least for the class of supersolvable groups, it turns out that choosing the
right bases is quite easy. Moreover, both problems (generating a DFT, performing
a DFT) can be solved in an essentially optimal way. Furthermore, the results
lead to fast algorithms not only in theory but also in a practical sense. The aim
of this paper is to present these results and some of their consequences without
giving too many technical details. For more information we refer to [1,2,4,5,8,12].

The rest of this paper is organized as follows. After some preparations, we 
describe in Section 2 geometrically a monomial DFT for supersolvable groups 
and indicate how this yields FFTs for supersolvable groups. Section 3 presents 
the main ideas of the algorithm that constructs monomial DFTs. Furthermore, 
some implementation details and running times are shown. Section 4 sketches 
some applications. 

2 Monomial DFTs for Supersolvable Groups 

A finite group $G$ is called supersolvable iff there exists a chain

$$\mathcal{T} = (G = G_n \supset G_{n-1} \supset \dots \supset G_1 \supset G_0 = \{1\})$$

such that each $G_i$ is a normal subgroup in $G$ and all indices $[G_i : G_{i-1}] =: p_i$ are
prime. Thus $\mathcal{T}$ is a chief series of $G$ with chief factors $G_i/G_{i-1}$ of prime order.
For example, all nilpotent groups (especially all groups of prime power order)
are supersolvable.

In this section we are going to describe the irreducible representations of 
supersolvable groups in a geometric way. This approach will be the theoretical 
basis of the algorithmic approach shown in the next section. We need some 
preparations. 

2.1 Basic Definitions and Tools 

The character $\chi$ of a representation $D$ of $\mathbb{C}G$ is defined by $\chi(g) := \mathrm{Trace}(D(g))$,
for $g \in G$. Characters are constant on conjugacy classes and two representations
are equivalent iff their characters are equal. Characters corresponding
to irreducible representations are called irreducible characters. A character
is called linear iff it corresponds to a representation of degree 1. By Irr($G$)
we denote the set of all irreducible characters of $G$. The space $CF(G, \mathbb{C})$ of
all complex-valued class functions on $G$ becomes an inner product space by
$(\chi|\psi) := |G|^{-1} \sum_{g \in G} \chi(g)\overline{\psi(g)}$. For a proof of the following facts we refer to [9]:
Theorem 1. For a finite group $G$ with $h$ conjugacy classes the following is true:

(1) $\mathrm{Irr}(G) = \{\chi_1, \dots, \chi_h\}$ is an orthonormal basis of $CF(G, \mathbb{C})$.

(2) Let $F$ and $D_k$ be representations of $\mathbb{C}G$ with characters $\chi$ and $\chi_k$, respectively.
If $D_k$ is irreducible, then the multiplicity $(D_k|F)$ with which $D_k$ occurs
in $F$ equals $(\chi_k|\chi)$.

(3) If $e_k := e_{\chi_k} := \frac{\chi_k(1)}{|G|} \sum_{g \in G} \chi_k(g^{-1})\, g$, then $e_1, \dots, e_h$ are a basis of the
center of $\mathbb{C}G$. Moreover, $1 = e_1 + \dots + e_h$ and $e_k e_j = \delta_{kj} e_k$. (The $e_k$ are
called central primitive idempotents in $\mathbb{C}G$.)

(4) If $M$ is a left $\mathbb{C}G$-module affording the representation $F$ with character $\chi$,
then $M = \bigoplus_{k=1}^{h} e_k M$ (isotypic decomposition). If $M_k$ is a simple module
affording the character $\chi_k$, then $e_k M$, the isotypic component of type $\chi_k$, is
isomorphic to the direct sum of $(\chi_k|\chi)$ copies of $M_k$. Every simple submodule
of $M$ affording the character $\chi_k$ is contained in $e_k M$.
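As a small illustration of Theorem 1 (1) and (2), the following sketch checks orthonormality of the irreducible characters of the symmetric group $S_3$ and computes the multiplicities of the irreducibles in the regular representation. The character table and class sizes of $S_3$ are standard; the names `IRR`, `inner` and `multiplicities` are chosen here for illustration only and do not come from the paper.

```python
from fractions import Fraction

# Conjugacy classes of S_3: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]
GROUP_ORDER = sum(CLASS_SIZES)  # |S_3| = 6

# Character table of S_3, one value per conjugacy class.
IRR = {
    "trivial": [1, 1, 1],
    "sign": [1, -1, 1],
    "standard": [2, 0, -1],
}

# Character of the regular representation: |G| at the identity, 0 elsewhere.
REGULAR = [6, 0, 0]


def inner(chi, psi):
    """(chi|psi) = |G|^{-1} * sum over g of chi(g) * conj(psi(g)).

    All values used here are real integers, so conjugation is a no-op,
    and the sum over G is taken class by class.
    """
    total = sum(size * x * y for size, x, y in zip(CLASS_SIZES, chi, psi))
    return Fraction(total, GROUP_ORDER)


def multiplicities(chi):
    """Multiplicity of each irreducible character in chi (Theorem 1 (2))."""
    return {name: inner(irr, chi) for name, irr in IRR.items()}
```

For the regular character, each irreducible occurs with multiplicity equal to its degree, which is what `multiplicities(REGULAR)` returns.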

Let $H$ be a subgroup of $G$. Then $\mathbb{C}H$ can be viewed as a subalgebra of $\mathbb{C}G$.
If $D$ is a representation of $\mathbb{C}G$, then its restriction to $\mathbb{C}H$ is a representation of
$\mathbb{C}H$ denoted by $D{\downarrow}H =: F$. In turn, $D$ is called an extension of $F$. Similarly,
$\chi{\downarrow}H$ denotes the restriction of the character $\chi$.

One important tool in constructing representations is the process of induction,
where a representation of a group $G$ is constructed from a representation of
a subgroup $H$. In terms of modules, the construction is straightforward: let $L$ be
a left ideal in $\mathbb{C}H$. Then $\mathbb{C}GL$ is a left ideal in $\mathbb{C}G$ and with $G = \bigsqcup_{i=1}^{r} g_i H$ one
obtains the decomposition $\mathbb{C}GL = \bigoplus_{i=1}^{r} g_i L$ as a $\mathbb{C}$-space. In particular, $\mathbb{C}GL$
has dimension $[G : H] \cdot \dim L$. The left $\mathbb{C}G$-module $\mathbb{C}GL$ is said to be induced
by $L$. A look at the corresponding matrix representations leads to the following
definition. Let $H$ be a subgroup of the group $G$, $T := (g_1, \dots, g_r)$ a transversal
of the left cosets of $H$ in $G$ and let $F$ be a representation of $\mathbb{C}H$ of degree $f$.
The induced representation $F{\uparrow}_T G$ of $\mathbb{C}G$ of degree $f \cdot r$ is defined for $x \in G$ by

$$F{\uparrow}_T G(x) := \bigl(\dot{F}(g_i^{-1} x g_j)\bigr)_{1 \le i,j \le r} \in \mathbb{C}^{fr \times fr},$$

where $\dot{F}(y) := F(y)$ if $y \in H$, and $\dot{F}(y)$ is the $f$-square all zero matrix, if
$y \in G \setminus H$. It is easily checked that this defines a representation of $\mathbb{C}G$. Taking
different transversals gives possibly different, but equivalent representations.
Thus in non-critical situations we sometimes write $F{\uparrow}G$ instead of $F{\uparrow}_T G$. Note
that $F{\uparrow}_T G(x)$, for $x \in G$, is a block matrix, with exactly one non-zero block
in each block row and in each block column. In particular, if $F$ is of degree 1,
then, for all $x \in G$, the matrix $F{\uparrow}_T G(x)$ is monomial. (Recall that a matrix is
called monomial iff it has exactly one non-zero entry in each row and in each
column. A representation $D$ of $\mathbb{C}G$ is said to be monomial iff $D(g)$ is monomial,
for all $g \in G$.) A group $G$ is called an M-group if every irreducible representation
is equivalent to a monomial one. Below we will give an alternative proof
of the well-known fact that supersolvable groups are M-groups. There is a close
connection between restriction and induction. A more precise statement reads
as follows.
Theorem 2 (Frobenius Reciprocity Theorem). Let $H$ be a subgroup of
$G$. Furthermore, let $F$ and $D$ be irreducible representations of $\mathbb{C}H$ and $\mathbb{C}G$,
respectively. Then the multiplicity $(F|D{\downarrow}H)$ of $F$ in $D{\downarrow}H$ equals the multiplicity
$(D|F{\uparrow}G)$ of $D$ in $F{\uparrow}G$: $(F|D{\downarrow}H) = (D|F{\uparrow}G)$.
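The induced representation defined before Theorem 2 can be sketched for a concrete case. The example below is an illustration under stated assumptions, not taken from the paper: $G = S_3$ acting on $\{0,1,2\}$, $H$ the cyclic subgroup generated by a 3-cycle, $F$ the linear character sending that 3-cycle to $\omega = e^{2\pi i/3}$, and the transversal (identity, one transposition). Entries are stored as exponents of $\omega$ (or `None` for zero), so all arithmetic is exact. The helper names are hypothetical.

```python
E = 3  # omega = exp(2*pi*i/3); matrix entries are stored as exponents mod E


def compose(p, q):
    """Product p*q of permutations given as tuples: (p*q)[i] = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


IDENTITY = (0, 1, 2)
C = (1, 2, 0)  # a 3-cycle generating H

# Linear character F of H, stored as exponents of omega: F(C^k) = omega^k.
H_EXPONENT = {IDENTITY: 0, C: 1, compose(C, C): 2}

# Transversal of the left cosets of H in G.
TRANSVERSAL = [IDENTITY, (1, 0, 2)]


def induced(x):
    """Matrix of the induced representation at x.

    Entry (i, j) is F(g_i^{-1} x g_j) if that element lies in H, else zero
    (represented by None).
    """
    return [
        [H_EXPONENT.get(compose(compose(inverse(gi), x), gj)) for gj in TRANSVERSAL]
        for gi in TRANSVERSAL
    ]


def is_monomial(m):
    """Exactly one non-zero entry in each row and in each column."""
    n = len(m)
    rows_ok = all(sum(v is not None for v in row) == 1 for row in m)
    cols_ok = all(sum(m[i][j] is not None for i in range(n)) == 1 for j in range(n))
    return rows_ok and cols_ok


def mat_mul(a, b):
    """Product of two monomial matrices in exponent form."""
    n = len(a)
    out = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if a[i][k] is not None and b[k][j] is not None:
                    out[i][j] = (a[i][k] + b[k][j]) % E
    return out
```

Since $F$ has degree 1, every `induced(x)` is a monomial matrix, and multiplying two of them only requires additions modulo 3.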

If $N$ is a normal subgroup of $G$ and $F$ a representation of $\mathbb{C}N$, then for $g \in G$
we define a new representation $F^g$ of $\mathbb{C}N$ by $F^g(n) := F(g^{-1} n g)$ for all $n \in N$. $F$
and $F^g$ are called conjugate representations. As $\{F(n) \mid n \in N\} = \{F^g(n) \mid n \in N\}$,
$F$ is irreducible iff $F^g$ is irreducible, and $G$ acts on Irr($N$) by conjugation via
$g\chi := (N \ni n \mapsto \chi(g^{-1} n g))$. The last tool needed for our geometric approach is
the following special case of Clifford theory.

Theorem 3 (Clifford’s Theorem). Let $N$ be a normal subgroup in $G$ of prime
index $p$, and let $F$ be an irreducible representation of $\mathbb{C}N$. For a fixed $g \in G \setminus N$,
let $T$ denote the transversal $(1, g, g^2, \dots, g^{p-1})$ of the cosets of $N$ in $G$. Then
exactly one of the following two cases applies.

(1) All $F^{g^i}$ are equivalent. Then there are exactly $p$ irreducible representations
$D_0, \dots, D_{p-1}$ of $\mathbb{C}G$ extending $F$. The $D_k$ are pairwise inequivalent and
satisfy $F{\uparrow}_T G \sim D_0 \oplus \dots \oplus D_{p-1}$. Moreover, if $\chi^0, \dots, \chi^{p-1}$ are the linear
characters of the cyclic group $G/N$ in a suitable order, we have $D_k = \chi^k \otimes D_0$
for all $k$, i.e., $D_k(x) = \chi^k(xN) D_0(x)$, for all $x \in G$.

(2) The representations $F^{g^i}$ are pairwise inequivalent. In this case, the induced
representation $F{\uparrow}G$ is irreducible.

To decide which case in Clifford’s Theorem applies, we work with intertwining
spaces. Recall that for two representations $D$ and $\Delta$ of $\mathbb{C}G$, both of degree $d$,
the intertwining space is defined by

$$\mathrm{Int}(D, \Delta) := \{X \in \mathbb{C}^{d \times d} \mid X D(g) = \Delta(g) X, \text{ for all } g \in G\}.$$

2.2 A Geometric Approach 

Let $\mathcal{T} = (G_n \supset G_{n-1} \supset \dots \supset G_0)$ be a chief series of the supersolvable group
$G = G_n$ of exponent $e$ and let $\chi \in \mathrm{Irr}(G)$ be a fixed irreducible character of $G$.
We are going to associate to $\chi$ a simple left $\mathbb{C}G$-module $M$ affording $\chi$ and a basis
in $M$ such that the resulting representation $D$ is $e$-monomial, i.e., $D$ is monomial
and each non-zero entry of $D(g)$, $g \in G$, is an $e$-th root of unity. To this end,
we consider all sequences $w = (\chi_0, \dots, \chi_n = \chi)$, with irreducible characters
$\chi_i \in \mathrm{Irr}(G_i)$. By Theorem 1 (4) and Frobenius Reciprocity the product

$$e(w) := e_{\chi_0} \cdot \ldots \cdot e_{\chi_n}$$

of the corresponding central primitive idempotents is non-zero iff all multiplicities
$(\chi_i|\chi_{i+1}) := (\chi_i|\chi_{i+1}{\downarrow}G_i)$ are positive. The set of all those sequences will
be denoted by $W(\chi)$:

$$W(\chi) := \{(\chi_0, \dots, \chi_n = \chi) \mid \chi_i \in \mathrm{Irr}(G_i),\ (\chi_i|\chi_{i+1}) > 0\}.$$
Theorem 4 ([6]). Let $G$ be a supersolvable group of exponent $e$, $\chi$ an irreducible
character of $G$, and $\mathcal{T}$ a chief series of $G$. Then the following holds.

(1) Let $w \in W(\chi)$. Then $e(w)$ is a primitive idempotent in $\mathbb{C}G$ and $\mathbb{C}Ge(w)$ is
a simple $\mathbb{C}G$-module affording the character $\chi$. The dimension of $\mathbb{C}Ge(w)$
equals $\chi(1) = |W(\chi)|$.

(2) Let $v \in W(\chi)$ and set $M := \mathbb{C}Ge(v)$. Then $M = \bigoplus_{w \in W(\chi)} e(w)M$ is a
decomposition of $M$ into 1-dimensional linear subspaces $e(w)M$.

(3) $G$ acts transitively on $W(\chi)$ by $g(\chi_0, \dots, \chi_n) := (g\chi_0, \dots, g\chi_n)$.

(4) For all $g \in G$ and all $w \in W(\chi)$ we have $g e(w) g^{-1} = e(gw)$.

(5) Let $U$ be the stabilizer of $v \in W(\chi)$ and $L$ a transversal of the left cosets of
$U$ in $G$. Then $\{\ell e(v) \mid \ell \in L\}$ is a $\mathbb{C}$-basis of $\mathbb{C}Ge(v)$ and the corresponding
representation $D$ of $\mathbb{C}G$ is $e$-monomial. More precisely, if the 1-dimensional
$\mathbb{C}U$-module $e(v)\mathbb{C}Ge(v)$ affords the linear character $\lambda$, then $e(v)$ equals the
central primitive idempotent corresponding to $\lambda$: $e(v) = e_\lambda$, and $D = \lambda{\uparrow}_L G$.

Proof. (1). $e_\chi = \bigl(\prod_{i<n} 1\bigr) e_\chi = \prod_{i<n} \bigl(\sum_{\chi_i \in \mathrm{Irr}(G_i)} e_{\chi_i}\bigr) e_\chi = \sum_{w \in W(\chi)} e(w)$. Thus
$e_\chi$ is the sum of $\chi(1)$ pairwise orthogonal idempotents $e(w)$; thus all $e(w)$ are
primitive.

(2). Let $M = \mathbb{C}Ge(v)$ and $w \in W(\chi)$. Then $e(w)$ applied to $M$ causes
successive isotypic decompositions of $M$ along $\mathcal{T}$: $e_{\chi_0} \cdot (e_{\chi_1} \cdot ( \cdots (e_{\chi_{n-1}} \cdot M) \cdots ))$.
As $(\chi_i|\chi_{i+1}) = 1$ (by Clifford’s Theorem), $e(w)M$ is a simple $\mathbb{C}G_0$-module, hence
one-dimensional.

(3). By Clifford’s Theorem, $G$ acts transitively on the irreducible constituents
of $\chi{\downarrow}G_{n-1}$. Observing that $G_{n-1}$ acts trivially on $\mathrm{Irr}(G_{n-1})$, an induction on
$n$ yields our claim.

(4). This follows from $g e_{\chi_i} g^{-1} = e_{g\chi_i}$, for all $\chi_i \in \mathrm{Irr}(G_i)$.

(5). By (3) and (4), $G$ acts transitively on the set of lines $\{e(w)M \mid w \in W(\chi)\}$
according to $g e(w) M = e(gw) g M = e(gw) M$. Choosing any nonzero vector
$x_w \in e(w)M$ yields a basis $(x_w)_{w \in W(\chi)}$ of $M$, and, by (2), the corresponding
matrix representation is monomial. Now we choose the $x_w$ in such a way that the
non-zero entries in the representation matrices of the group elements are all $e$th
roots of unity. To this end, let $U \le G$ denote the stabilizer of $v \in W(\chi)$ and $L$ a
left coset transversal of $U$ in $G$. As for $g \in G$, $0 \ne g e(v) = e(gv) g e(v) \in e(gv)M$
and $G$ acts transitively on $W(\chi)$, the set $\{\ell e(v) \mid \ell \in L\}$ is a $\mathbb{C}$-basis of $M$. As
$U$ stabilizes the line $e(v)M = \mathbb{C}e(v)$, there exists a linear character $\lambda$ of $U$ such
that $u e(v) = \lambda(u) e(v)$, for all $u \in U$. Now let $e_\lambda = |U|^{-1} \sum_{u \in U} \lambda(u^{-1}) u \in \mathbb{C}U$
denote the central primitive idempotent corresponding to $\lambda$. Then $e_\lambda e(v) =
e(v) = e(v) e_\lambda$, and hence $\mathbb{C}Ge(v) \le \mathbb{C}Ge_\lambda$. Thus $e_\lambda = a e(v)$, for some $a \in \mathbb{C}G$.
But then, $e(v) = e_\lambda e(v) = a e(v) e(v) = a e(v) = e_\lambda$. □

The above result suggests introducing the $\mathcal{T}$-character graph of $G$. This
graph has $n + 1$ levels. The nodes of level $i$ are the irreducible characters of $G_i$.
Edges exist at most between nodes of adjacent levels. More precisely, there
is an edge between $\chi_i \in \mathrm{Irr}(G_i)$ and $\chi_{i+1} \in \mathrm{Irr}(G_{i+1})$ iff $(\chi_i|\chi_{i+1}) > 0$ (note
that $(\chi_i|\chi_{i+1}) = 1$ for supersolvable groups). In addition, it is very convenient
to know the action of $G$ on each level. For this it suffices to take one element
$g_j \in G_j \setminus G_{j-1}$ for each $j$ and specify the action of $g_n, \dots, g_{i+1}$ on $\mathrm{Irr}(G_i)$. (Note
that all $g_k$, $k \le i$, act trivially on $\mathrm{Irr}(G_i)$.) Figure 1 in Subsection 3.3 shows the
character graph of a group of order 128.



2.3 FFTs for Supersolvable Groups 

According to the last subsection we already know that a supersolvable group $G$
of exponent $e$ has an $e$-monomial DFT $D = \bigoplus_{k=1}^{h} D_k$. The construction of the
$D_k$ was along $\mathcal{T}$. Now we look at such a DFT $D$ from a more algorithmic point
of view.

Definition 1. Let $\mathcal{T} = (G = G_n \supset \dots \supset G_0 = \{1\})$ be a chain of subgroups of the
finite group $G$. A representation $D$ of $\mathbb{C}G$ is called $\mathcal{T}$-adapted iff for all $0 \le i \le n$
the following conditions hold:

(1) The restriction $D{\downarrow}G_i$ is equal to the direct sum of irreducible representations
of $\mathbb{C}G_i$, i.e., $D{\downarrow}G_i = \bigoplus_q F_{iq}$, with irreducible representations $F_{iq}$.

(2) Equivalent irreducible constituents of $D{\downarrow}G_i$ are equal, i.e., if $F_{iq} \sim F_{it}$ then
$F_{iq} = F_{it}$ (but not necessarily $q = t$).

If $D$ is $\mathcal{T}$-adapted then for all $i \le n$, $D{\downarrow}G_i$ is $\mathcal{T}_i$-adapted, where $\mathcal{T}_i$ denotes the
chain $(G_i \supset \dots \supset G_0)$. It is not hard to show that the above constructed monomial
DFTs for supersolvable groups are in fact $\mathcal{T}$-adapted, see, e.g., [BCS, p. 337].
Now we can state the main result of this subsection.

Theorem 5 (Baum [1]). If $G$ is a supersolvable group with chief series $\mathcal{T}$,
then any $\mathcal{T}$-adapted DFT of $\mathbb{C}G$ is monomial and can be evaluated with at most
$\gamma \cdot |G| \cdot \log |G|$ operations, where $1.5 \le \gamma \le 8.5$ depends on the prime divisors of
$|G|$.

Proof. (Sketch) Let $[G_n : G_{n-1}] = p$ and $D^n = \bigoplus D_k$ a $\mathcal{T}$-adapted monomial
DFT of $\mathbb{C}G_n$. Let $F_1, \dots, F_r$ be the distinct irreducible constituents of $D^n{\downarrow}G_{n-1}$.
Then $D^{n-1} := \bigoplus_{j=1}^{r} F_j$ is a monomial, $\mathcal{T}_{n-1}$-adapted DFT of $\mathbb{C}G_{n-1}$. As copying
is free in our model, $L(D^n{\downarrow}G_{n-1}) = L(D^{n-1})$.

Instead of evaluating $D^n$ at $a \in \mathbb{C}G_n$ directly, we rewrite $a$ according to
the coset decomposition $G_n = \bigsqcup_{j<p} g^j G_{n-1}$. Then for suitable $a_j \in \mathbb{C}G_{n-1}$ we
have $a = \sum_{j<p} g^j a_j$. Hence $D^n(a) = \sum_{j<p} D^n(g^j)\,(D^n{\downarrow}G_{n-1})(a_j)$. This formula
suggests a divide-and-conquer strategy. In the divide-step, we evaluate the
$p$ “smaller” DFTs $D^{n-1}(a_j)$. By a tricky application of Clifford’s Theorem combined
with local FFTs of size $p$ to handle simultaneously the cases of $p$ extensions,
the conquer-step is managed in such a way that altogether an $O(|G| \log |G|)$ upper
bound is obtained. □
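The divide-and-conquer formula in the proof sketch can be made concrete in the simplest special case. The sketch below assumes $G$ is cyclic of order $2^n$ with generator $g$ and $G_{n-1} = \langle g^2 \rangle$, so $a = a_0 + g\,a_1$ with $a_0$ collecting the even and $a_1$ the odd exponents. All irreducible representations then have degree 1 and the conquer-step reduces to the classical Cooley-Tukey butterfly with twiddle factors. This is only the cyclic case, not Baum's general algorithm for supersolvable groups; the function names and the sign convention of the characters are choices made here.

```python
import cmath


def naive_dft(a):
    """Evaluate all characters chi_k(g^j) = exp(-2*pi*i*j*k/n) at a directly."""
    n = len(a)
    return [
        sum(a[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft(a):
    """Radix-2 FFT for a cyclic group of order n = 2^m.

    Divide-step: two DFTs of the subgroup of index 2, applied to the
    coefficients at even and at odd powers of the generator.
    Conquer-step: combine them using D(g), which here is a twiddle factor.
    """
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(a)
    even = fft(a[0::2])  # DFT of a_0 over the subgroup
    odd = fft(a[1::2])   # DFT of a_1 over the subgroup
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out
```

Each level of the recursion costs a number of operations linear in $n$, and there are $\log_2 n$ levels, which matches the $O(|G| \log |G|)$ bound in this special case.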

According to Theorem 4 and Theorem 5, a $\mathcal{T}$-adapted DFT $D = \bigoplus_{k=1}^{h} D_k$
of $\mathbb{C}G$ is essentially unique. In the sequel, we sometimes write $\mathrm{Irrep}(G, \mathcal{T}) =
\{D_1, \dots, D_h\}$ and similarly $\mathrm{Irrep}(G_i, \mathcal{T}_i)$.
3 Efficient Construction of Monomial DFTs

In this section we give a summary of the algorithm in [3] which constructs
a monomial DFT of a supersolvable group $G$ given by a pc-presentation with
$O(|G| \log |G|)$ operations. One can even show that the running time is essentially
proportional to the output length. For a detailed description and analysis of this
algorithm we refer to [3].

3.1 PC-Presentations

Let $G$ be a supersolvable group with chief series $\mathcal{T}$ as above. For $1 \le i \le n$ let
$g_i$ be an element in $G_i$ not in $G_{i-1}$. With respect to $(g_1, \dots, g_n)$ each element
$g \in G$ can be expressed uniquely in normal form

$$g = g_n^{\varepsilon_n} \cdot g_{n-1}^{\varepsilon_{n-1}} \cdot \ldots \cdot g_1^{\varepsilon_1} \qquad (0 \le \varepsilon_i < p_i).$$

The multiplication in $G$ is completely described, if the normal forms of all powers
$g_i^{p_i}$ and all commutators $[g_i, g_j] := g_i^{-1} g_j^{-1} g_i g_j$ are known. More formally, every
supersolvable group has a power-commutator presentation (pc-presentation) of
the form

$$G = \langle g_1, \dots, g_n \mid g_i^{p_i} = u_i \ (1 \le i \le n),\ [g_i, g_j] = w_{ij} \ (1 \le i < j \le n) \rangle,$$

with primes $p_i$ as well as words $u_i \in G_{i-1}$ and $w_{ij} \in G_i$, all given in normal
form. Moreover, we require the presentation to be consistent, i.e., that every
word in the generators has a unique normal form. Consistent pc-presentations
of this kind exactly describe the class of supersolvable groups.

With respect to such a pc-presentation, an irreducible representation of
the group $G_i$ is fully described by the representing matrices of the generators
$g_1, \dots, g_i$.

As an example, we give a consistent pc-presentation of a supersolvable group
with 128 elements denoted by $G_{128}$. In the presentation, trivial commutator
relations are omitted.

Gi 28 = {g7,96,g5,g4,g3,g2,gi\g1 = gl = gl = gl = gl = ^,gl = gi,97 = gi, 

[92, 9q] = [92,97] = [93,94] = [53,55] = [53,55] = 51, [53,57] = 52, 

[54,55] = 52 • 51, [54,55] = 53 • 51, [55,57] = 53, [55,57] = 55) 

3.2 The Algorithm 

Before describing the algorithm, we want to mention the following important 
points. First, the pc-presentation of G already contains all the information on 
the group needed in the algorithm, so no group operations are required at all. 

Second, even though the irreducible representations are computed over $\mathbb{C}$,
it turns out that the algorithm uses just integer arithmetic. Hence, we never
run into numerical problems! More precisely, all matrices to be processed by
the algorithm are $e$-monomial and all matrix manipulations are multiplications.
Therefore, we can compute in the additive group $\mathbb{Z}_e$, which is isomorphic to the
group of $e$th roots of unity in $\mathbb{C}$. (One can show that the algorithm works over
any field $K$ containing a primitive $e$th root of unity, but, for simplicity, we
just consider the case $K = \mathbb{C}$.)

The central idea of the algorithm is based on Clifford’s Theorem. In our
notation it says that given an irreducible representation $F$ of $\mathbb{C}G_{i-1}$, $0 < i \le n$,
then there are two cases:

Case 1. $F$ extends to $p_i = [G_i : G_{i-1}]$ pairwise inequivalent irreducible
representations of $\mathbb{C}G_i$ of the same degree $\deg(F)$.

Case 2. The induction of $F$ is an irreducible representation of $\mathbb{C}G_i$ of degree
$p_i \cdot \deg(F)$.

Furthermore, up to equivalence all irreducible representations of $\mathbb{C}G_i$ can be
obtained this way. This allows us to construct the irreducible representations
of $\mathbb{C}G$ iteratively in a bottom-up fashion along the chief series $\mathcal{T}$. However,
constructing an arbitrary DFT is not what we want. We are interested in the
construction of a very special set of irreducible representations, namely
representations resulting in $e$-monomial matrices when evaluated at group elements.
Suppose we have already constructed a full set of nonequivalent irreducible
$e$-monomial representations of $\mathbb{C}G_{i-1}$, denoted by $\mathcal{F}$. In order to construct an
$e$-monomial $D \in \mathrm{Irrep}(G_i, \mathcal{T}_i)$ of level $i$ from a given $e$-monomial $F \in \mathcal{F}$ of level
$i - 1$, we need to know the relation between the conjugate representation $F^{g_i}$
and the corresponding $F' \in \mathcal{F}$ with $F^{g_i} \sim F'$. That is the reason why the
intertwining spaces $\mathrm{Int}(F^{g_i}, F')$ come into play. It turns out that all intertwining
matrices in $\mathrm{Int}(F^{g_i}, F')$ are scalar multiples of an $e$-monomial matrix. In a second
phase, the algorithm computes such intertwining matrices. To cut a long story
short, we now give a summary of the algorithm. At level $i$ the algorithm takes
the following input:

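Phase 1. $\mathcal{F} = \mathrm{Irrep}(G_{i-1}, \mathcal{T}_{i-1})$, i.e., a full set of nonequivalent irreducible
$e$-monomial representations of $\mathbb{C}G_{i-1}$ such that $\bigoplus_{F \in \mathcal{F}} F$ is $\mathcal{T}_{i-1}$-adapted.
Phase 2. For every $i - 1 < j \le n$ a permutation $\pi_j$ of $\mathcal{F}$ such that $F^{g_j} \sim \pi_j F$
for all $F \in \mathcal{F}$ as well as $e$-monomial matrices $X_{jF} \in \mathrm{Int}(F^{g_j}, \pi_j F)$, $F \in \mathcal{F}$.

The following output is computed:

Phase 1. $\mathcal{D} = \mathrm{Irrep}(G_i, \mathcal{T}_i)$, i.e., a full set of nonequivalent irreducible
$e$-monomial representations of $\mathbb{C}G_i$ such that $\bigoplus_{D \in \mathcal{D}} D$ is $\mathcal{T}_i$-adapted.

Phase 2. For every $i < j \le n$ a permutation $\tau_j$ of $\mathcal{D}$ such that $D^{g_j} \sim \tau_j D$ for
all $D \in \mathcal{D}$ as well as $e$-monomial matrices $Y_{jD} \in \mathrm{Int}(D^{g_j}, \tau_j D)$, $D \in \mathcal{D}$.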

Note that the input of level 0 is trivial, all intertwining matrices being set to 1. 
Level i of the algorithm consists of two phases. 

Phase 1 (Computation of $\mathcal{D}$). Consider $F \in \mathcal{F}$ and its $g_i$-conjugate
representation $F^{g_i}$.

Case 1. $F \sim F^{g_i}$, i.e., $\pi_i F = F$. Then, by Clifford’s Theorem, there are exactly
$p := p_i$ pairwise nonequivalent irreducible extensions $D_0, \dots, D_{p-1}$ of $F$ to $\mathbb{C}G_i$
satisfying $D_k = \chi^k \otimes D_0$, where $\chi^0, \dots, \chi^{p-1}$ are the irreducible characters of
the cyclic group $G_i/G_{i-1}$. Since $D_k{\downarrow}G_{i-1} = F$, $k = 0, \dots, p - 1$, in this step
only the $D_k(g_i)$ have to be computed. One can show that $D_k(g_i) \in \mathrm{Int}(F^{g_i}, F)$
and $c^p X_{iF}^p = F(g_i^p)$ with a constant $c \in \mathbb{C}^*$. The last equation has $p$ distinct
solutions $c_0, \dots, c_{p-1} \in \mathbb{C}^*$, which can be proven to be even $e$th roots of unity.
Thus the desired $e$-monomial matrices $D_k(g_i)$, $0 \le k < p$, just differ by a factor,
which is a power of a $p_i$th root of unity, and are given by $D_k(g_i) := c_k X_{iF}$.
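For a representation $F$ of degree 1, the equation $c^p X_{iF}^p = F(g_i^p)$ of Case 1 becomes a linear congruence in $\mathbb{Z}_e$. The sketch below makes this explicit under the assumption that $X_{iF} = \omega^x$ and $F(g_i^p) = \omega^a$ for a primitive $e$th root of unity $\omega$, so that $c = \omega^t$ must satisfy $p\,(t + x) \equiv a \pmod e$. The function name and the brute-force search are illustrative choices; the algorithm in [3] is not reproduced here.

```python
def extension_exponents(p, e, x, a):
    """All t in Z_e with p * (t + x) = a (mod e).

    Degree-1 sketch of Case 1: X = omega^x, F(g^p) = omega^a, c = omega^t.
    Each solution t gives one extension with D_k(g) = omega^(t + x).
    Returns the solutions in increasing order (empty if none exist).
    """
    return [t for t in range(e) if (p * (t + x) - a) % e == 0]
```

With $p = 2$, $e = 8$, $x = 0$ and $a = 2$ the solutions are $t = 1$ and $t = 5$; they differ by $e/p = 4$, i.e. the two extensions differ by the factor $\omega^4 = -1$, a power of a $p$th root of unity as stated above.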

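Case 2. $F \not\sim F^{g_i}$, i.e., $\pi_i F \ne F$. Again, by Clifford’s Theorem, the induced
representation $F{\uparrow}G_i$ is irreducible and $(F{\uparrow}G_i){\downarrow}G_{i-1} = \bigoplus_{k=0}^{p-1} F^{g_i^k}$. As
$F^{g_i^k} \sim \pi_i^k F$, we know the existence of a unique irreducible representation $D$ of
$\mathbb{C}G_i$ such that $D{\downarrow}G_{i-1} = \bigoplus_{k=0}^{p-1} \pi_i^k F$. This $\mathcal{T}_i$-adapted representation is now
to be computed. We already know $D(g_1), \dots, D(g_{i-1})$ from level $i - 1$. Thus it
remains to specify $D(g_i)$. Here, the intertwining spaces constructed in level $i - 1$
are to be used. If $X_k := X_{i,\pi_i^{k-1}F} \cdots X_{iF}$, $0 \le k < p$, then

$$D(g_i) = \begin{pmatrix} & & & & Z \\ X_1 X_0^{-1} & & & & \\ & X_2 X_1^{-1} & & & \\ & & \ddots & & \\ & & & X_{p-1} X_{p-2}^{-1} & \end{pmatrix},$$

where $Z := X_0 F(g_i^p) X_{p-1}^{-1}$, as shown in [3].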

By these two constructions, all irreducible representations of $G_i$ up to
isomorphism are obtained, and Phase 1 is complete. In addition, during the construction
in Phase 1 a bipartite graph is built up in which $F \in \mathcal{F}$ and $D \in \mathcal{D}$ are linked
if and only if $F$ is a constituent of $D{\downarrow}G_{i-1}$. This “traceback” information is
needed in the next phase. Furthermore, this information, collected over all levels
$i = 1, \dots, n$, is nothing else but the $\mathcal{T}$-character graph of the group $G$.

Phase 2 (Computation of $\tau_j$ and $Y_{jD}$). Let $F \in \mathcal{F}$ and $i < j \le n$. We have
to consider the same two cases as in Phase 1.

Case 1. $\pi_i F = F$. In Phase 1, the $p$ extensions $D_0, \dots, D_{p-1}$ have been computed.
As $D_k$ is an extension of $F$, one can show that $\tau_j D_k$ must be an extension
of $\pi_j F$. Let $\Delta_0, \dots, \Delta_{p-1}$ be the extensions of $\pi_j F$. Then it can be shown that
$Y_{jD_k} := X_{jF}$ must be set for all $k$ and $\tau_j(\{D_0, \dots, D_{p-1}\}) = \{\Delta_0, \dots, \Delta_{p-1}\}$.
Using $Y_{jD_k}$ one can determine $\tau_j$ as is explained in [3].

Case 2. $\pi_i F \ne F$. In this case, $\tau_j(D)$ can be immediately determined, since it
equals the unique $\Delta \in \mathcal{D}$ such that $\Delta{\downarrow}G_{i-1}$ contains $\pi_j F$ (this information is
encoded in the bipartite graph built up in Phase 1). We do not want to discuss
here the construction of $Y_{jD}$, which is a bit delicate, but refer to [3].
3.3 An Example: $G_{128}$

Figure 1 shows the character graph of the group $G_{128}$ given by its pc-presentation
in Subsection 3.1. Each node represents an irreducible character and its
corresponding irreducible representation. The numbers on the left hand side
indicate the levels and the numbers on the top are the degrees of the corresponding
representations of the top level. To illustrate the above algorithm, we describe the
construction of the irreducible representation of level 7, denoted by $D$,
corresponding to the circled node in Figure 1.



Fig. 1. Character graph of $G_{128}$ (rows: level $l$; columns: degree $d$ of the top-level representations).



The representation $D$ is induced, let’s say by the representation $F$ of level 6,
and is constructed in Phase 1 Case 2 of the algorithm. Suppose the algorithm
has already constructed all data up to level 6 including $F$ and the intertwining
matrix $X_{7F}$. Since $p = p_7 = 2$, we have $D{\downarrow}G_6 = F \oplus \pi_7 F$ and $D$ is known at
the generators $g_1, \dots, g_6$. We can compute $D(g_7)$ by the formula

$$D(g_7) = \begin{pmatrix} 0 & X_0 F(g_7^2) X_1^{-1} \\ X_1 X_0^{-1} & 0 \end{pmatrix} = \begin{pmatrix} 0 & F(g_7^2) X_{7F}^{-1} \\ X_{7F} & 0 \end{pmatrix} = \begin{pmatrix} 0 & F(g_7^2) \\ 1 & 0 \end{pmatrix},$$

whose non-zero entries are powers of $\omega$, where $\omega$ denotes an 8th root of unity,
$e = 8$ being the exponent of $G_{128}$. Here we have used that $X_0$ is always the
identity and that $X_1 = X_{7F}$ is in our specific example also the identity matrix of
dimension 2. The matrix $F(g_7^2)$ has to be computed using the power-relation
for $g_7^2$ of the pc-presentation.
3.4 Implementation

The presented algorithm has been implemented in the programming language C.
The tests were run on an Intel Pentium II with 300 MHz. As we have mentioned
previously, no field arithmetic is needed but only computations in the additive
group $\mathbb{Z}_e$. For simplicity, we have assumed $e$ to be known, even though one can
show that this is not necessary.

The efficiency of the implementation is based on the fact that $e$-monomial
matrices of size $N$ can be multiplied or inverted with only $N$ operations in $\mathbb{Z}_e$.
Since any $e$-monomial matrix $M \in \mathbb{C}^{N \times N}$ can be written in the form

$$M = \pi \, \mathrm{diag}(\omega^{a_1}, \dots, \omega^{a_N})$$

with a permutation $\pi \in S_N$ and non-zero coefficients $\omega^{a_i}$, just the $2N$
integers $\pi(1), \dots, \pi(N)$ and $a_1, \dots, a_N$ have to be stored for $M$. For the group
G and any r G IM define d^{G) := where h denotes the number of 

conjugacy classes of G and c?i, . . . ,dh the degrees of the irreducible characters 
of G. One easily checks, that the running time to write out the result of the 
algorithm, i.e., all matrices Di^k{gi), 1 < i < n, 1 < k < hi {hi denoting the 
number of conjugacy classes of Gj), 1 < ^ < i, is proportional to * ’ d}{Gi), 
which is bounded from above by ' d^{Gi). 
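The storage scheme for $e$-monomial matrices described at the start of this subsection can be sketched as follows. This is a minimal illustration in Python, not the authors' C implementation; the class name `Monomial` and the convention that column $j$ carries its non-zero entry $\omega^{a_j}$ in row $\pi(j)$ are assumptions made here.

```python
import cmath
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Monomial:
    """e-monomial N x N matrix stored as 2N integers.

    Column j has its only non-zero entry in row perm[j], and that entry
    equals omega**exps[j] for a primitive e-th root of unity omega.
    """
    e: int
    perm: Tuple[int, ...]
    exps: Tuple[int, ...]

    def __matmul__(self, other):
        """Matrix product self * other using N additions in Z_e."""
        n = len(self.perm)
        perm = tuple(self.perm[other.perm[j]] for j in range(n))
        exps = tuple(
            (other.exps[j] + self.exps[other.perm[j]]) % self.e for j in range(n)
        )
        return Monomial(self.e, perm, exps)

    def inverse(self):
        """Inverse matrix using N negations in Z_e."""
        n = len(self.perm)
        perm = [0] * n
        exps = [0] * n
        for j in range(n):
            perm[self.perm[j]] = j
            exps[self.perm[j]] = (-self.exps[j]) % self.e
        return Monomial(self.e, tuple(perm), tuple(exps))

    def to_dense(self):
        """Expand to a dense complex matrix (only for checking small cases)."""
        n = len(self.perm)
        dense = [[0j] * n for _ in range(n)]
        for j in range(n):
            dense[self.perm[j]][j] = cmath.exp(2j * cmath.pi * self.exps[j] / self.e)
        return dense
```

Only integer arithmetic modulo $e$ is used in `__matmul__` and `inverse`, so no rounding errors can occur; `to_dense` is included solely to compare against ordinary matrix arithmetic.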

One can show that the number of operations of the algorithm is of this 
magnitude log(|Gi|) • d^{Gi)) with a moderate constant < 20. In this 

sense the algorithm is nearly optimal. The following table shows the running 
times for some small supersolvable groups to construct all the matrices Di^k{gi) 
as above. Here |G| is the order of G, h the number of conjugacy classes of G, o.l. 
the output length of the algorithm (i.e., ' d^{Gi)), r.t. the running 

time in milliseconds and r.t./o.l. the quotient of the last two quantities. The
groups in the first three examples are direct products of the symmetric group
$S_3$, the group in the fourth example is $G_{128}$ from Subsection 3.3 and the last
example is concerned with a Sylow 2-subgroup of the symmetric group $S_{16}$.



G             |G|      h      o.l.     r.t. (ms)   r.t./o.l.
(S_3)^5       7776     243    13235    266         0.020
(S_3)^6       46656    729    63528    1125        0.018
(S_3)^7       279936   2187   296464   4250        0.014
G_128         128      40     280      15          0.054
Syl_2(S_16)   32768    230    30960    2156        0.069



Of course, the first three groups are of a very simple nature. However, the
running time of the algorithm does not essentially depend on the complexity of the
pc-presentation, but mainly on the number and degrees of the irreducible
representations constituting the DFT. This is verified by the more complex example
$\mathrm{Syl}_2(S_{16})$. Therefore, the actual running times for constructing a monomial DFT
of $\mathbb{C}G$ reflect very well the theoretical result concerning the output length.
4 Applications

4.1 Related Work 

The fast DFT-generation algorithm has been used as a subroutine to solve other
computational problems. Thümmel [12] has designed an algorithm that computes
from a pc-presentation of a finite $p$-group $G$ in time $O(p \cdot h \cdot |G|)$ its $h$ conjugacy
classes as well as the character table. Omrani and Shokrollahi [8] have combined
the fast DFT-generation algorithm with Galois theory to construct a full set
of irreducible representations of a supersolvable group $G$ over a finite field $K$,
$\operatorname{char} K \nmid |G|$, which is not assumed to contain a primitive $e$th root of unity.

4.2 Fast Convolution 

FFT-algorithms allow a fast convolution in the group algebra $\mathbb{C}G$ along the
formula: $a \cdot b = D^{-1}(D(a) \cdot D(b))$, for $a, b \in \mathbb{C}G$. (Note that the linear complexities
of a DFT $D$ and its inverse do not differ substantially, for $|L(D) - L(D^{-1})| \le |G|$,
see [2].) Let $D = \bigoplus_{k=1}^{h} D_k$ be a DFT of $\mathbb{C}G$ and $d_k$ the degree of $D_k$. Then the
convolution in $\mathbb{C}G$ can be performed with at most $2L(D) + L(D^{-1}) + 2\sum_{k=1}^{h} d_k^3$
arithmetic operations. Thus if $D$, and hence $D^{-1}$, allows a fast Fourier transform,
and $d := \max_k d_k$, then convolution can be done in time $O(|G| \log |G| + d|G|)$.
As $1 \le d \le |G|^{1/2}$, this constitutes a substantial improvement over the naive
convolution algorithm, which performs this task in time quadratic in the order
of $G$. Even in a very special case, a variant of this FFT-based fast convolution
in $\mathbb{C}G$ might shed new light onto a classical problem in computational group
theory. A sketch of this will be the last topic of this paper.
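The convolution formula $a \cdot b = D^{-1}(D(a) \cdot D(b))$ can be checked in the simplest setting. The sketch below assumes $G$ is cyclic of order $n$, so every $d_k = 1$ and $D(a) \cdot D(b)$ is a componentwise product. It uses a naive quadratic-time DFT to stay short, so it illustrates the formula rather than the speed-up; any fast DFT could be substituted for `dft`. All function names are chosen here for illustration.

```python
import cmath


def dft(a, sign=-1):
    """Naive DFT of the cyclic group of order n = len(a)."""
    n = len(a)
    return [
        sum(a[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse of dft: conjugate characters and division by |G|."""
    n = len(spectrum)
    return [value / n for value in dft(spectrum, sign=+1)]


def convolve_direct(a, b):
    """Multiplication in the group algebra of Z/n, quadratic in n."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            c[(i + j) % n] += a[i] * b[j]
    return c


def convolve_via_dft(a, b):
    """a * b = D^{-1}(D(a) . D(b)); the product is componentwise here."""
    return inverse_dft([x * y for x, y in zip(dft(a), dft(b))])
```

For a non-abelian group the componentwise product would be replaced by a product of block diagonal matrices, which is where the term $2\sum_k d_k^3$ in the operation count comes from.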



4.3 DFT-Based Collection 



As already mentioned, every element $a$ in a pc-presented supersolvable group $G$
can be expressed as a normal word: $a = g^\alpha := g_n^{\alpha_n} \cdot \ldots \cdot g_1^{\alpha_1}$. The normal
form problem is to compute on input $(\alpha, \beta)$ the unique $\gamma$ with $g^\alpha \cdot g^\beta = g^\gamma$.
Classical techniques for solving this problem involve various kinds of collection
processes (see, e.g., [10]) or Hall polynomials combined with interpolation
techniques (see, e.g., [11]). To the best of our knowledge, there is no strategy that is
always superior to all other strategies.

As an alternative to classical collection strategies we propose DFT-based
normalization. To simplify our notation, we start with a pc-presented $p$-group
$G$ with corresponding chief series $\mathcal{T} = (G_n \supset \dots \supset G_0)$ and complete lists
$\mathrm{Irrep}(G_i, \mathcal{T}_i)$ of $\mathcal{T}_i$-adapted $e$-monomial irreducible representations of $\mathbb{C}G_i$. $D_{i,0}$
always denotes the trivial representation of $\mathbb{C}G_i$, $D_{i,1}$ always a non-trivial extension
of $D_{i-1,0}$ satisfying $D_{i,1}(g_i) = \zeta$, where $\zeta$ is a primitive $p$-th root of unity.

On input $(\alpha, \beta)$, the algorithm proceeds in $n$ steps ($n$ downto 1) to compute $\gamma$.
After Step $i + 1$, the numbers $\gamma_n, \dots, \gamma_{i+1}$ are known. To get $\gamma_i$ in Step $i$, we
work in $G/G_{i-1}$ by replacing $g_j$ by 1, for all $j < i$. Consider the word

$$w_i := g_{i+1}^{-\gamma_{i+1}} \cdots g_n^{-\gamma_n} \cdot g_n^{\alpha_n} \cdots g_i^{\alpha_i} \cdot g_n^{\beta_n} \cdots g_i^{\beta_i}.$$

By definition, $w_i \in G_i$ and $w_i \equiv g_i^{\gamma_i} \bmod G_{i-1}$. We want to compute $D_{i,1}(w_i)$,
since $\gamma_i$ is determined by $D_{i,1}(w_i) = D_{i,1}(g_i)^{\gamma_i} = \zeta^{\gamma_i}$. However, since $w_i$ is
expressed in all generators $g_i, \dots, g_n$, this cannot be done directly at level $i$. To
this end we choose a suitable representation $F \in \mathrm{Irrep}(G_n, \mathcal{T}_n)$ whose restriction
to $\mathbb{C}G_i$ contains $D_{i,1}$ as its first irreducible constituent. Then all that remains
to do is to compute the first position of the diagonal matrix $F(w_i)$, which equals
$D_{i,1}(w_i) = D_{i,1}(g_i)^{\gamma_i} = \zeta^{\gamma_i}$. As

$$F(w_i) = F(g_{i+1})^{-\gamma_{i+1}} \cdots F(g_n)^{-\gamma_n} \cdot F(g_n)^{\alpha_n} \cdots F(g_i)^{\alpha_i} \cdot F(g_n)^{\beta_n} \cdots F(g_i)^{\beta_i}$$

is a product of monomial matrices and we are interested in only one entry of the
final result, each factor $F(g_j)$ causes only one addition in $\mathbb{Z}_e$. Altogether, we
obtain the following.

Theorem 6. Let $G$ be a pc-presented $p$-group of order $p^n$ and exponent $e$, with
corresponding chief series $\mathcal{T}$. Then, given (suitable parts of) the $\mathcal{T}$-adapted DFT,
normalization of the product of two normal words in $G$ can be done with at most
$2 \cdot p \cdot n^2$ additions in $\mathbb{Z}_e$.

Finally, we want to remark that a similar result holds for the normalization of
any formula in the generators $g_1, \dots, g_n$ of $G$.
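The observation behind Theorem 6, that one entry of a product of monomial matrices costs one addition in $\mathbb{Z}_e$ per factor, can be sketched as follows. The representation of a monomial matrix as a pair `(perm, exps)`, meaning that column $j$ has its non-zero entry $\omega^{\mathtt{exps}[j]}$ in row `perm[j]`, and the function name are assumptions made for this illustration; the matrices in the usage example are arbitrary and not taken from a DFT of a specific group.

```python
def first_column_entry(factors, e):
    """Locate the non-zero entry in column 0 of a product of monomial matrices.

    factors: list of (perm, exps) pairs; the product is
             factors[0] * factors[1] * ... * factors[-1].
    Returns (row, exponent): the entry sits in that row and equals
    omega**exponent. Uses exactly one addition mod e per factor.
    """
    row, exponent = 0, 0
    # Apply the factors to the first basis vector, rightmost factor first.
    for perm, exps in reversed(factors):
        exponent = (exponent + exps[row]) % e
        row = perm[row]
    return row, exponent
```

If the product is a diagonal matrix, as for $F(w_i)$ above, the returned row is 0 and the exponent is the exponent of the first diagonal entry.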
Abstract. We consider sufficient conditions for ruling out some useless
iteration steps or all subsequent iteration steps without degradation of 
error performance in soft-decision iterative decoding algorithms for bi- 
nary block codes used over the AWGN channel using BPSK signaling. 
Then the derivation of such sufficient conditions and the selection of cen- 
ters of search regions in iterative steps are formulated uniformly as a type 
of integer programming problems. Several techniques for reducing such 
an integer programming problem to a set of subprograms with smaller 
computational complexities are presented. 



1 Introduction 

A sufficient condition on the optimality of a decoded codeword rules out the
subsequent iterations in soft-decision iterative decoding algorithms. Such a
condition is a kind of early termination condition and it was first introduced in [1]
based on a single candidate codeword already generated and then its improvement
was presented in [2]. Furthermore, stronger sufficient conditions, denoted
$\mathrm{Cond}_{S,h}$, based on at most $h$ candidate codewords already generated have been
reported in [3,4] for $h = 2$ and 3, and in [5] for $h = 4$. $\mathrm{Cond}_{S,3}$ was effectively
used to reduce the number of iterations in a low weight-trellis based iterative
soft-decision (iterative MWTS) decoding algorithm [6]. A sufficient condition for
the soft-decision decoding based on ordered statistics was presented in [7,8] for
$h = 1$, [9] for $h = 2$ and formulated in [10] for $h > 2$.

A sufficient condition for ruling out some useless test error patterns for
Chase-type decoding [11] was first introduced in [12]. A sufficient condition, denoted
$\mathrm{Cond}_{RO}$, for ruling out some useless next test error patterns in Chase-type
decoding was derived in [13]. Simulation results [13] for several example codes show
that this ruling-out condition is much more effective in reducing the number of
iterations than $\mathrm{Cond}_{S,1}$, with almost the same computational complexity.

An early termination condition $\mathrm{Cond}_{ET}$ is presented in [14] for Chase-type
decoding algorithms. Simulation results show that $\mathrm{Cond}_{ET}$ is more effective than
$\mathrm{Cond}_{S,1}$, especially as the number of test error patterns grows. The combination
of $\mathrm{Cond}_{ET}$ with $\mathrm{Cond}_{RO}$ turns out to be very effective.

* The paper is partially based on reference [14]. 
The derivation of ruling-out, termination and sufficient conditions and the
selection of centers of search regions [15] in iterative steps are shown to be a type
of integer programming problem. Several techniques for reducing such an
integer programming problem to a set of subproblems with smaller computational
complexities are presented.

2 Definitions 

Suppose a binary block code $C$ of length $N$ is used for error control over the
AWGN channel using BPSK signaling. Each codeword is transmitted with the
same probability. Let $r = (r_1, r_2, \ldots, r_N)$ be a received sequence and let
$z = (z_1, z_2, \ldots, z_N)$ be the binary hard-decision sequence obtained from $r$ using the
hard-decision function: $z_i = 1$ for $r_i > 0$ and $z_i = 0$ for $r_i \le 0$. Any soft-decision
decoding scheme is devised based on $r$ or reliability information provided by $r$.
For the AWGN channel and BPSK transmission, the reliability of a received
symbol is generally measured by its magnitude $|r_i|$.

2.1 Correlation Discrepancy 

For a positive integer $n$, let $V^n$ denote the vector space of all binary $n$-tuples. For
$u = (u_1, u_2, \ldots, u_N) \in V^N$, the correlation between $u$ and the received sequence
$r$ is given by $M(u) \triangleq \sum_{i=1}^{N} r_i (2u_i - 1)$. Then $M(z) = \sum_{i=1}^{N} |r_i| \ge M(u)$ for any
$u \in V^N$. Define $D_{-1}(u) \triangleq \{i : u_i \ne z_i \text{ and } 1 \le i \le N\}$ and

$$L(u) \triangleq \sum_{i \in D_{-1}(u)} |r_i| = (M(z) - M(u))/2. \qquad (2.1)$$

$L(u)$ is called the correlation discrepancy of $u$ with respect to $z$. For a subset
$U$ of $V^N$, let $L[U]$ be defined as $L[\emptyset] = \infty$ for the empty set $\emptyset$ and

$$L[U] \triangleq \min_{u \in U} L(u). \qquad (2.2)$$

The assumption that $L(u) \ne L(v)$ for any different tuples $u$ and $v$ in $V^N$ will be
called the assumption of correlation uniqueness. For a positive real number
$\epsilon$ and $u \ne v$, the probability of $|L(u) - L(v)| < \epsilon$ approaches zero as $\epsilon$ approaches
zero. For a nonempty set $U \subseteq V^N$, let $v[U]$ denote a binary $N$-tuple $u \in U$ such
that $L(u) = L[U]$. Under the above assumption, $v[U]$ is unique. The following
lemma holds.

Lemma 1. Under the assumption of correlation uniqueness, $L[U'] < L[U]$ for
$\emptyset \ne U \subseteq U' \subseteq V^N$ if and only if $v[U'] \notin U$.

The maximum likelihood decoder decodes the received sequence $r$ into the
optimal codeword $c_{opt}$ for which

$$L(c_{opt}) = L[C]. \qquad (2.3)$$

If $z$ is a codeword, then $z$ is the optimal codeword. A candidate codeword $c$ is said
to be better (or more likely) than another candidate codeword $d$ if $L(c) < L(d)$.
A candidate codeword $c$ is said to be the best if $L(c)$ is the minimum among a
specified set of candidate codewords.
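
As a concrete illustration of the definitions above, the following Python sketch computes the hard-decision sequence, the correlation $M(u)$ and the correlation discrepancy $L(u)$, and checks identity (2.1) numerically on every binary 5-tuple. The received values and all function names are hypothetical choices made for this illustration; they do not come from the paper.

```python
from itertools import product

def hard_decision(r):
    """z_i = 1 if r_i > 0, else 0."""
    return [1 if ri > 0 else 0 for ri in r]

def correlation(r, u):
    """M(u) = sum_i r_i * (2*u_i - 1)."""
    return sum(ri * (2 * ui - 1) for ri, ui in zip(r, u))

def discrepancy(r, u):
    """L(u): sum of |r_i| over the positions where u differs from z."""
    z = hard_decision(r)
    return sum(abs(ri) for ri, ui, zi in zip(r, u, z) if ui != zi)

def best_in(r, U):
    """v[U]: the element of U with the smallest discrepancy (None if U is empty)."""
    return min(U, key=lambda u: discrepancy(r, u), default=None)

r = [0.9, -0.2, 1.4, -1.1, 0.3]   # hypothetical received values
z = hard_decision(r)

# Check L(u) = (M(z) - M(u)) / 2 for every u in V^5.
for u in product([0, 1], repeat=len(r)):
    lhs = discrepancy(r, u)
    rhs = (correlation(r, z) - correlation(r, u)) / 2
    assert abs(lhs - rhs) < 1e-12
```

With the length-5 repetition code as a toy code, `best_in(r, [(0,0,0,0,0), (1,1,1,1,1)])` picks the all-ones word, which is the maximum likelihood decision for this `r`.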

2.2 Soft-Decision Iterative Decoding Algorithm 

We consider the following type of soft-decision iterative decoding algorithms.
Let IDA denote an algorithm of this type for a binary block code $C$ of length
$N$. For a received sequence $r$, if the binary hard-decision sequence $z$ is in $C$,
then $z$ is output as the optimal decoded codeword and the decoding terminates.
Otherwise, IDA starts from the first stage. At the $j$-th stage, IDA performs the
following two subprocesses (G) and (T):

(G) Generation of a Candidate Codeword: A search region at the $j$-th stage,
denoted $R(j)$, is searched through to find the best codeword, denoted $c(j)$. If
$C \cap R(j) = \emptyset$, then define $c(j) = *$. For convenience, we define $L(*) = \infty$. The
codeword $c(j)$ ($\ne *$) is called the candidate codeword generated at the $j$-th
stage. Define $R_P(j)$, $R_F(j)$ and $R_{IDA}$ as $R_P(0) = \emptyset$ and

$$R_P(j) \triangleq \bigcup_{i=1}^{j} R(i), \text{ for } j \ge 1, \qquad (2.4)$$

$$R_F(j) \triangleq \Big(\bigcup_{i > j} R(i)\Big) \setminus R_P(j), \text{ and } R_{IDA} \triangleq \bigcup_{j \ge 1} R(j). \qquad (2.5)$$

Define $c_{best}(j)$ to be the best codeword in $C \cap R_P(j)$. If $C \cap R_P(j) = \emptyset$, then
define $c_{best}(j) = *$.

(T) Test of Termination Condition: A termination condition, $\mathrm{Cond}_T$, is
tested. If $\mathrm{Cond}_T$ holds, then $c_{best}(j)$ is output as the decoded codeword and the
decoding terminates (if $*$ is output, IDA fails in decoding). If $\mathrm{Cond}_T$ is false,
IDA proceeds to the $(j+1)$-th stage.

The termination condition $\mathrm{Cond}_T$ usually limits the number of iterations
in a predetermined way. We will introduce a ruling-out condition and an early
termination condition into IDA in Section 3.
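
The two subprocesses (G) and (T) can be sketched as a simple loop. The Python below is only a minimal sketch under stated assumptions: the search regions are handed in as explicit lists, the termination condition is simply "all given regions have been searched", and the name `run_ida` and its interface are hypothetical.

```python
import math

def run_ida(r, code, regions):
    """Sketch of IDA where Cond_T means 'all given regions have been searched'.

    code    : set of codewords (tuples of 0/1)
    regions : list of search regions R(1), R(2), ... (iterables of tuples)
    Returns c_best, or None (standing for '*') if no candidate was found.
    """
    z = tuple(1 if ri > 0 else 0 for ri in r)

    def L(u):
        # correlation discrepancy of u with respect to z
        return sum(abs(ri) for ri, ui, zi in zip(r, u, z) if ui != zi)

    if z in code:                        # z itself is the optimal codeword
        return z
    c_best, l_best = None, math.inf      # L(*) is taken to be infinity
    for R in regions:                    # stages j = 1, 2, ...
        candidates = [c for c in R if c in code]   # (G): search C within R(j)
        if candidates:
            c_j = min(candidates, key=L)
            if L(c_j) < l_best:
                c_best, l_best = c_j, L(c_j)
        # (T): a ruling-out or early termination test would be placed here
    return c_best
```

Keeping `c_best` and its discrepancy as loop state mirrors the definition of $c_{best}(j)$ as the best codeword found in the regions searched so far.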

3 Ruling-Out and Early Termination Conditions 

3.1 Ruling-Out Condition 

Note that $R_P(j') \setminus R_P(j-1)$ with $1 \le j \le j'$ denotes the subset of $R_P(j')$ which
has not been searched before the $j$-th stage. Let $L_{RO}(j,j')$ be a lower bound
such that

$$L_{RO}(j,j') \le L[C \cap (R_P(j') \setminus R_P(j-1))]. \qquad (3.1)$$

Lemma 2. If the following condition holds at the beginning of the $j$-th stage:

$$L(c_{best}(j-1)) \le L_{RO}(j,j'), \qquad (3.2)$$

then $c_{best}(j'') = c_{best}(j-1)$ for $j \le j'' \le j'$.

Hence, if the condition (3.2) holds, then the subprocesses (G) of the $j$-th
through $j'$-th stages may be skipped, and the condition (3.2) is called a
ruling-out (useless decoding subprocesses) condition.

3.2 Early Termination Condition 

Let $L_{ET}(j)$ be a lower bound such that:

$$L_{ET}(j) \le L[C \cap R_F(j)]. \qquad (3.3)$$

Lemma 3. Suppose the following condition holds:

$$L(c_{best}(j)) \le L_{ET}(j). \qquad (3.4)$$

Then $L(c_{best}(i)) = L(c_{best}(j))$ for any $i \ge j$.

Hence, there is no improvement in error performance from any further iteration
if (3.4) holds. The condition (3.4) is called an early termination condition.

3.3 Sufficient Condition of Optimality 

In case $R_F(j)$ in (3.3) cannot be expressed in a simple form, we can simply
substitute $V^N \setminus R_P(j)$ for $R_F(j)$. Let $L_S(j)$ be a lower bound such that

$$L_S(j) \le L[C \cap (V^N \setminus R_P(j))]. \qquad (3.5)$$

Then the condition that

$$L(c_{best}(j)) \le L_S(j) \qquad (3.6)$$

is a sufficient condition for the optimality of $c_{best}(j)$.

Next we consider how to derive such lower bounds as $L_S(j)$, $L_{RO}(j,j')$ and
$L_{ET}(j)$ without information on $C$ other than the minimum distance
$d_{\min}$ or a few smallest distances in the distance profile of $C$.

From (3.1), (3.3), (3.5) and (2.5), the lower bounds can be derived by
evaluating $L[U]$, where $U$ is of the following form:

$$U = C \cap (X \setminus R_P(j)) = (C \setminus R_P(j)) \cap (X \setminus R_P(j)), \quad X \subseteq V^N. \qquad (3.7)$$

The second term in the right-hand side of (3.7) is shown in [13,14] and Sec. 4
for several decoding algorithms. The first term can be expressed as

$$C \setminus R_P(j) = C \setminus \Big(\bigcup_{i=1}^{j} R(i)\Big) = \bigcap_{i=1}^{j} (C \setminus R(i)). \qquad (3.8)$$

For $v \in V^N$ and a positive integer $d$, define

$$O_d(v) \triangleq \{x \in V^N : d_H(x,v) \le d\} \text{ and } \bar{O}_d(v) \triangleq \{x \in V^N : d_H(x,v) \ge d\}, \qquad (3.9)$$
where $d_H(x,v)$ denotes the Hamming distance between $x$ and $v$. For most
iterative decoding algorithms,

$$C \setminus R_P(j) \subseteq \bigcap_{i=1}^{j} \bar{O}_{d_i}(u_i), \qquad (3.10)$$

where if $c(i) \ne *$, then $u_i = c(i)$ and $d_i$ is $d_{\min}$ or a small distance in the distance
profile of $C$ (see Example 4.6), and if $c(i) = *$, then the $i$-th search center (see
Example 4.3) and $\lfloor (d_{\min} + 1)/2 \rfloor$ can be chosen as $u_i$ and $d_i$, respectively. For
positive integers $n$, $h$, $d_i$ and $u_i \in V^n$ with $1 \le i \le h$, define

$$V^n_{d_1,d_2,\ldots,d_h}(u_1,u_2,\ldots,u_h) \triangleq \{u \in V^n : d_H(u,u_i) \ge d_i, \text{ for } 1 \le i \le h\}. \qquad (3.11)$$
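
To make the role of the set in (3.11) concrete, the sketch below evaluates $L[V^n_{d_1,\ldots,d_h}(u_1,\ldots,u_h)]$ by exhaustive search and uses it as the bound $L_S$ in the sufficient condition (3.6). This is a brute-force illustration for toy lengths only, not one of the efficient methods the paper refers to; the data and the helper names are hypothetical.

```python
from itertools import product

def hamming(x, v):
    """Hamming distance between two equal-length binary tuples."""
    return sum(a != b for a, b in zip(x, v))

def in_region(x, refs, dists):
    """True if d_H(x, u_i) >= d_i for every reference word u_i."""
    return all(hamming(x, u) >= d for u, d in zip(refs, dists))

def lower_bound(r, refs, dists):
    """Brute-force minimum discrepancy over the region; exponential in len(r)."""
    z = [1 if ri > 0 else 0 for ri in r]
    best = float("inf")
    for x in product([0, 1], repeat=len(r)):
        if in_region(x, refs, dists):
            cost = sum(abs(ri) for ri, xi, zi in zip(r, x, z) if xi != zi)
            best = min(best, cost)
    return best

# Toy check with the length-5 repetition code (d_min = 5), hypothetical data.
r = [0.9, -0.2, 1.4, -1.1, 0.3]
c_best = (1, 1, 1, 1, 1)
L_S = lower_bound(r, [c_best], [5])   # every other codeword is at distance >= 5
L_c = 0.2 + 1.1                       # c_best differs from z in positions 2 and 4
is_optimal = L_c <= L_S               # sufficient condition (3.6)
```

Here the only word at distance 5 from `c_best` is the all-zero word, so `L_S` equals its discrepancy and the test confirms that `c_best` is optimal without searching further.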

4 Examples of Soft-Decision Iterative Decoding Algorithm

For $1 \le i \le j \le N$, define $[i,j] \triangleq \{i, i+1, \ldots, j\}$, called an interval. For
$x = (x_1, x_2, \ldots, x_N)$, $v = (v_1, v_2, \ldots, v_N) \in V^N$ and a nonempty set $I \subseteq [1,N]$,
define

$$d_{H,I}(x,v) \triangleq |\{i \in I : x_i \ne v_i\}|. \qquad (4.1)$$

For $I = \{i_1, i_2, \ldots, i_m\}$ with $1 \le i_1 < i_2 < \cdots < i_m \le N$,

$$p_I(x) \triangleq (x_{i_1}, x_{i_2}, \ldots, x_{i_m}). \qquad (4.2)$$

We abbreviate $d_{H,[i,j]}(x,v)$, $d_{H,[1,N]}(x,v)$ and $p_{[i,j]}(x)$ as $d_{H,i,j}(x,v)$, $d_H(x,v)$
and $p_{i,j}(x)$, respectively.

For simplicity, we assume that the bit positions $1, 2, \ldots, N$ are ordered
according to the reliability order given as follows:

$$|r_i| \le |r_j|, \text{ for } 1 \le i < j \le N. \qquad (4.3)$$

For nonnegative integers $s$ and $t$ such that $s + 2t \le d_{\min} - 1$ and $u \in V^N$, the
decoding which corrects $s$ erasures in the first $s$ bit positions and $t$ or fewer errors
in the remaining bit positions of input $u$ is called $(s,t)$-decoding with respect
to $u$. We consider next several examples.

Example 4.1: GMD-Like Decoding: Define $\rho$ as 0 for even $d_{\min}$, and 1
for odd $d_{\min}$. For $u \in V^N$, GMD($u$) decoding is defined as follows: For $1 \le
j \le p \triangleq (d_{\min} + \rho)/2$, the subprocess (G) of the $j$-th stage of GMD($u$) is the
$(2j - \rho - 1, p - j)$-decoding with respect to $u$. GMD($u$) terminates at the $p$-th
stage. GMD($z$) is the original GMD proposed by Forney [16]. Then for $1 \le j \le p$,

$$R(j) = \{v \in V^N : d_{H,2j-\rho,N}(v,u) \le p - j\}. \qquad (4.4)$$

It follows from (2.5) and (4.4) that

$$V^N \setminus R_{GMD(u)} = \{v \in V^N : d_{H,2i-\rho,N}(v,u) > p - i, \text{ for } 1 \le i \le p\}. \qquad (4.5)$$
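
The stage parameters of Example 4.1 are easy to tabulate. The short sketch below lists the $(s,t)$ pair used at each stage and lets one confirm that $s + 2t = d_{\min} - 1$ at every stage, so each stage uses the full erasure-and-error correcting capability. The names `rho` (parity of $d_{\min}$) and `p` (number of stages) follow the reading of the example adopted here, and the function name is hypothetical.

```python
def gmd_stages(d_min):
    """Return the (s, t) pairs of the GMD-like decoding stages j = 1, ..., p."""
    rho = d_min % 2              # 0 for even d_min, 1 for odd d_min
    p = (d_min + rho) // 2       # number of stages
    stages = []
    for j in range(1, p + 1):
        s = 2 * j - rho - 1      # erasures in the s least reliable positions
        t = p - j                # errors corrected in the remaining positions
        stages.append((s, t))
    return stages
```

For example, `gmd_stages(7)` gives four stages that erase 0, 2, 4 and 6 positions while correcting 3, 2, 1 and 0 errors.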
Example 4.2: We introduce "Multiple-GMD". For a positive integer $h$, $h$-GMD
consists of successive GMD($u^{(i)}$) for $1 \le i \le h$, where $u^{(1)} = z$ and

$$u^{(i+1)} \triangleq v\Big[\bigcap_{j=1}^{i} \big(V^N \setminus R_{GMD(u^{(j)})}\big)\Big], \text{ for } 1 \le i < h. \qquad (4.6)$$

It is shown in [17] that $u^{(2)} + z = (u^{(2)}_1, u^{(2)}_2, \ldots, u^{(2)}_N)$, where

$$u^{(2)}_i = \begin{cases} 1, & \text{if } i + \rho \text{ is even and } 1 \le (i + \rho)/2 \le p, \\ 0, & \text{otherwise.} \end{cases} \qquad (4.7)$$

Example 4.3: Generalized Chase Algorithm [18]: Bounded distance-$t_0$
decoding around an input word $v(j) \in V^N$, called the $j$-th search center, is used
for generating $c(j)$, where $0 \le t_0 \le (d_{\min} - 1)/2$. Then,

$$C \setminus R(j) \subseteq \bar{O}_{d(j)}(u(j)), \qquad (4.8)$$

where

$$u(j) \triangleq \begin{cases} c(j), & \text{if } c(j) \ne *, \\ v(j), & \text{if } c(j) = *, \end{cases} \qquad (4.9)$$

$$d(j) \triangleq \begin{cases} d_{\min}, & \text{if } c(j) \ne *, \\ t_0 + 1, & \text{if } c(j) = *. \end{cases} \qquad (4.10)$$

For choosing $v(j)$, the following method is proposed in [15]: Choose $v(j+1)$
such that

$$v(j+1) = v[V^N_{d_1,d_2,\ldots,d_h}(u_1,u_2,\ldots,u_h)], \qquad (4.11)$$

where $u_1, u_2, \ldots, u_h$ are chosen from $u(i)$ with $1 \le i \le j$ defined by (4.9).

Example 4.4: Multiple Chase-Like Decoding Algorithm: For $1 \le \tau \le N$
and $u \in V^N$, Chase($u$) denotes a $2^\tau$-stage decoding algorithm whose $j$-th stage
is bounded distance-$t_0$ decoding with input $u + e(j)$, where $t_0 = \lfloor (d_{\min} - 1)/2 \rfloor$
and $e(j)$ is the $j$-th test error pattern in $E_\tau \triangleq \{x \in V^N : p_{\tau+1,N}(x) = 0\}$.
Chase($z$) is the original Chase decoding algorithm [11]. Then

$$V^N \setminus R_{Chase(u)} = \{x \in V^N : d_{H,\tau+1,N}(x,u) \ge t_0 + 1\}. \qquad (4.12)$$

For a positive integer $h$, $h$-Chase consists of successive Chase($u^{(i)}$) with $1 \le i \le
h$, where $u^{(i)}$, called the $i$-th search center, is given by $u^{(1)} = z$ and for $1 \le i < h$

$$u^{(i+1)} = v[\{x \in V^N : d_{H,\tau+1,N}(x,u^{(j)}) \ge t_0 + 1, \text{ for } 1 \le j \le i\}]. \qquad (4.13)$$
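
The test error patterns of Example 4.4 can be enumerated directly. The following sketch generates the set $E_\tau$ (all patterns supported on the first $\tau$, i.e. least reliable, positions) and the corresponding decoder inputs $u + e(j)$. The order in which patterns are produced is an arbitrary choice of this sketch, and the function names are hypothetical.

```python
from itertools import product

def chase_test_patterns(N, tau):
    """All binary N-tuples whose last N - tau coordinates are zero (the set E_tau)."""
    return [bits + (0,) * (N - tau) for bits in product((0, 1), repeat=tau)]

def chase_inputs(u, tau):
    """Inputs u + e(j) (mod 2) for the 2**tau stages of Chase(u)."""
    return [tuple(a ^ b for a, b in zip(u, e))
            for e in chase_test_patterns(len(u), tau)]
```

Each input would then be passed to a bounded distance-$t_0$ decoder, which is not modelled here.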

Example 4.5: Decoding Based on Ordered Statistics [7]: Let $C$ be a
binary linear $(N,K)$ code, and let $M_K = \{i_1, i_2, \ldots, i_K\}$, where $1 \le i_1 <
i_2 < \cdots < i_K \le N - d_{\min} + 1$, be the most reliable basis (MRB) [7]. For
$x = (x_1, x_2, \ldots, x_K) \in V^K$, there is a unique codeword $\gamma(x) = (c_1, c_2, \ldots, c_N)$
in $C$ such that

$$c_{i_h} = z_{i_h} + x_h, \text{ for } 1 \le h \le K. \qquad (4.14)$$

For $0 \le h \le K$, define $V^K_h$ to be the set of those binary $K$-tuples whose weights
are $h$ or less. The order-$h$ decoding [7] generates candidate codewords $\{\gamma(x) :
x \in V^K_h\}$ iteratively in increasing order of the weight of $x$. The search region,
denoted $R_{OS\text{-}h}$, is the subset of $C$ defined by

$$R_{OS\text{-}h} = \{\gamma(x) : x \in V^K_h\} = \{v \in C : d_{H,M_K}(v,z) \le h\}. \qquad (4.15)$$

Similarly, we can treat the decoding algorithm in [19].

Example 4.6: Iterative MWTS Decoding Algorithm: In the iterative
minimum-weight subtrellis search (MWTS) decoding algorithm [6] for binary linear
codes, the first candidate codeword $c(1)$ is obtained by ordered statistics
decoding of the zero-th order or the first order [7]. At the $j$-th stage with $j > 1$, the
next candidate codeword $c(j)$ is the best codeword in $\{v \in C : d_H(v, c(j-1)) =
w_1\}$, which is obtained by MWTS around $c(j-1)$, where $w_1 (= d_{\min})$ is the
minimum weight. Let $w_2$ be the second smallest weight of $C$. Then,

$$R(j) = O_{w_1}(c(j-1)) \setminus \{c(j-1)\}, \text{ for } j > 1, \qquad (4.16)$$

and we have that $C \setminus R(1) \subseteq \bar{O}_{w_1}(c(1))$ and for $j > 1$,

$$C \setminus R(j) = C \setminus \{O_{w_1}(c(j-1)) \setminus \{c(j-1)\}\} \subseteq (\bar{O}_{w_2}(c(j-1)) \cup \{c(j-1)\}) \cap \bar{O}_{w_1}(c(j)). \qquad (4.17)$$

5 Formulation as Integer Programming Problem 

For the decoding algorithms stated in Sec. 4 and in [13,14], the derivation of
the lower bounds $L_S$, $L_{RO}$ and $L_{ET}$ and the selection of search centers can be
uniformly formulated as the following optimization problem.

For a received sequence $r = (r_1, r_2, \ldots, r_N)$, a set $\Xi \subseteq [1,h]$, nonempty sets
$I_j \subseteq [1,N]$, $u_j \in V^N$, and positive integers $d_j \le N$ for $1 \le j \le h$, find the
following lower bound

$$\min_{x} L(x), \qquad (5.1)$$

where $x \in V^N$ is subject to the constraints

$$d_{H,I_j}(x,u_j) \begin{cases} \ge d_j, & \text{for } j \in \Xi, \\ = d_j, & \text{for } j \notin \Xi, \end{cases} \qquad (5.2)$$

where $u_1, u_2, \ldots, u_h$ are called reference words.

Define $\mathbf{1} \triangleq (1, 1, \ldots, 1)$. Note that $d_{H,I_j}(x,u_j) \le d_j$ if and only if

$$d_{H,I_j}(x, u_j + \mathbf{1}) \ge |I_j| - d_j, \text{ for } 1 \le j \le h. \qquad (5.3)$$
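Define $y = x + z$ by $y_i = x_i + z_i \pmod 2$ for $1 \le i \le N$. Then

$$L(x) = \sum_{i=1}^{N} |r_i| y_i. \qquad (5.4)$$

For $v \in V^N$, $1 \le j \le h$ and $I \subseteq [1,N]$, define

$$D_{0,I}(v) \triangleq \{1, 2, \ldots, N\} \setminus I, \quad D_{1,I}(v) \triangleq \{i \in I : v_i = z_i\}, \qquad (5.5)$$

$$D_{-1,I}(v) \triangleq \{i \in I : v_i \ne z_i\}, \quad n_{-1,I}(v) \triangleq |D_{-1,I}(v)|. \qquad (5.6)$$

Since $d_{H,I}(x,u_j) = d_{H,I}(y, u_j + z)$,

$$d_{H,I_j}(x,u_j) = \sum_{i \in D_{1,I_j}(u_j)} y_i - \sum_{i \in D_{-1,I_j}(u_j)} y_i + n_{-1,I_j}(u_j). \qquad (5.7)$$

For a vector $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_h) \in \{-1, 0, 1\}^h$, define

$$D_\alpha \triangleq \bigcap_{j=1}^{h} D_{\alpha_j,I_j}(u_j), \quad n_\alpha \triangleq |D_\alpha|. \qquad (5.8)$$

Define $A \triangleq \{\alpha \in \{-1, 0, 1\}^h : D_\alpha \ne \emptyset\}$. Since $\bigcup_{\alpha \in A} D_\alpha = [1,N]$ and $D_\alpha \cap D_{\alpha'} =
\emptyset$ for $\alpha \ne \alpha'$, $\{D_\alpha : \alpha \in A\}$ is a partition of $[1,N]$ and therefore $|A| \le N$.
For $1 \le i \le N$ and $1 \le j \le h$, let $a_{ji}$ be the $j$-th component of the vector $\alpha \in A$
with $i \in D_\alpha$. Then Lemma 4 follows from (5.4) to (5.8).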

Lemma 4. The optimization problem (5.1) and (5.2) can be expressed as the
following integer programming problem:

$$\min \sum_{i=1}^{N} |r_i| y_i, \qquad (5.9)$$

where $y_i \in \{0, 1\}$ is subject to the equalities

$$\sum_{i=1}^{N} a_{ji} y_i = \delta_j + \sigma_j, \text{ for } 1 \le j \le h, \qquad (5.10)$$

where $\delta_j \triangleq d_j - n_{-1,I_j}(u_j)$, and $\sigma_j$ is a slack variable such that $\sigma_j \ge 0$ for $j \in \Xi$
and $\sigma_j = 0$ for $j \notin \Xi$.
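
The equivalence stated in Lemma 4 can be checked numerically on small instances. The sketch below solves the original problem (5.1)-(5.2) over $x$ and the integer program (5.9)-(5.10) over $y$, both by exhaustive enumeration, and the two optimal values should coincide. Everything here is an illustrative assumption: positions are 0-based, `ineq` plays the role of the index set of inequality constraints, and the instance data are made up.

```python
from itertools import product

def solve_direct(r, z, refs, index_sets, dists, ineq):
    """Brute-force (5.1)-(5.2): minimise L(x) under the distance constraints."""
    best = float("inf")
    for x in product((0, 1), repeat=len(r)):
        feasible = True
        for j, (u, I, d) in enumerate(zip(refs, index_sets, dists)):
            dist = sum(x[i] != u[i] for i in I)
            feasible &= (dist >= d) if j in ineq else (dist == d)
        if feasible:
            best = min(best, sum(abs(r[i]) for i in range(len(r)) if x[i] != z[i]))
    return best

def solve_ipp(r, z, refs, index_sets, dists, ineq):
    """Brute-force (5.9)-(5.10) with coefficients a_ji in {-1, 0, 1}."""
    N = len(r)
    # a_ji = +1 if i is in I_j and u_j agrees with z at i, -1 if it disagrees, 0 otherwise
    a = [[(1 if u[i] == z[i] else -1) if i in I else 0 for i in range(N)]
         for u, I in zip(refs, index_sets)]
    # delta_j = d_j - (number of positions of I_j where u_j differs from z)
    delta = [d - sum(u[i] != z[i] for i in I)
             for u, I, d in zip(refs, index_sets, dists)]
    best = float("inf")
    for y in product((0, 1), repeat=N):
        feasible = True
        for j in range(len(refs)):
            lhs = sum(a[j][i] * y[i] for i in range(N))
            # slack sigma_j >= 0 for inequality constraints, sigma_j = 0 otherwise
            feasible &= (lhs >= delta[j]) if j in ineq else (lhs == delta[j])
        if feasible:
            best = min(best, sum(abs(r[i]) * y[i] for i in range(N)))
    return best

# Hypothetical instance: two reference words, one inequality and one equality.
r = [0.1, -0.4, 0.5, -0.7, 0.9, 1.2]
z = (1, 0, 1, 0, 1, 1)
refs = [z, (0, 0, 0, 0, 0, 0)]
index_sets = [set(range(6)), {0, 1, 2, 3}]
dists = [3, 2]
ineq = {0}
v_direct = solve_direct(r, z, refs, index_sets, dists, ineq)
v_ipp = solve_ipp(r, z, refs, index_sets, dists, ineq)
```

Both searches are exponential in $N$; the point of the reformulation is that the integer program exposes the structure which the reductions discussed next exploit.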

For the above integer programming problem, denoted IPP, a binary $N$-tuple
$y$ which meets the constraints is called feasible, and a feasible $y$ at which the
objective function takes its minimum is called optimal. For $\alpha \in A$, define

$$q_\alpha \triangleq \sum_{i \in D_\alpha} y_i. \qquad (5.11)$$

Then

$$0 \le q_\alpha \le n_\alpha. \qquad (5.12)$$

Equalities (5.10) can be expressed as

$$\sum_{\alpha \in A} \alpha_j q_\alpha = \delta_j + \sigma_j, \text{ for } 1 \le j \le h, \qquad (5.13)$$

where $\alpha_j$ denotes the $j$-th component of $\alpha$. Let $Q$ denote the set of those
$|A|$-tuples of nonnegative integers $(q_\alpha : \alpha \in A)$ which satisfy (5.12) and (5.13).
By using vector representation, equalities (5.13) are expressed as

$$\sum_{\alpha \in A} q_\alpha \alpha = \delta + \sigma, \qquad (5.14)$$

where $\delta = (\delta_1, \delta_2, \ldots, \delta_h)$ and $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_h)$.

For a subset $X \subseteq [1,N]$ and an integer $m$ with $m \le |X|$, let $X[m]$ denote the
set of the $m$ smallest integers in $X$ for $1 \le m \le |X|$ and the empty set $\emptyset$ for $m \le 0$.
Then, from (4.3), (5.9) and (5.11), the optimal value of the objective function of
the IPP can be expressed as

$$\min \sum_{\alpha \in A} \sum_{i \in D_\alpha[q_\alpha]} |r_i|, \qquad (5.15)$$

where the minimum is taken over $Q$.

Define $q \triangleq (q_\alpha : \alpha \in A)$, where the components are ordered in a fixed
order. Then $q$ defined by (5.11) from an optimal solution $y$ of IPP is also called
optimal. Conversely, for an optimal $q$, the solution $y$ whose $i$-th bit is 1 if and
only if $i \in D_\alpha[q_\alpha]$ for the $\alpha \in A$ such that $i \in D_\alpha$ is optimal, by (4.3) and (5.11).
Hereafter, we consider the IPP in terms of $q_\alpha$ with $\alpha \in A$.

For $S \subseteq A$, $\Xi' \subseteq \Xi$ and an $h$-tuple $c = (c_1, c_2, \ldots, c_h)$ over nonnegative
integers such that $c_j = 0$ for $j \notin \Xi$, let IPP($S$, $\Xi'$, $c$), called a sub-IPP, denote
the integer programming problem whose constraints are those of the IPP and
the following additional ones:

$$q_\alpha = 0, \text{ for } \alpha \notin S, \qquad (5.16)$$

$$\sigma_j \begin{cases} \ge c_j, & \text{for } j \in \Xi', \\ = c_j, & \text{for } j \notin \Xi'. \end{cases} \qquad (5.17)$$

We say that the IPP can be reduced into a set of sub-IPP's if and only if there
exists an optimal solution $q$ of the IPP which is an optimal solution of one of
the sub-IPP's.

For example, suppose that the $j$-th restriction is $d_H(x,u_j) \ge d_{\min}$ with
$u_j \in C$ and it is enough to consider $x$ to be a codeword of $C$. Since the restriction
is equivalent to $d_H(x,u_j) = w_1$ or $d_H(x,u_j) \ge w_2$, IPP can be reduced to
sub-IPP's.

Define $S(q) \triangleq \{\alpha \in A : q_\alpha > 0\}$. The following lemma and simple corollary
are useful for the reduction of IPP. Lemma 5 follows from (4.3), (5.14) and (5.15).

Lemma 5. Let $q$ be optimal. Then (i) for any nonempty subset $X$ of $S(q)$,
$\sum_{\alpha \in X} \alpha \not\le \sigma$, and (ii) for $\alpha \in S(q)$ and $\alpha' \in A$ such that $n_{\alpha'} > q_{\alpha'}$, if the
$q_\alpha$-th integer of $D_\alpha$ is greater than the $(q_{\alpha'} + 1)$-th integer of $D_{\alpha'}$, then $\alpha - \alpha' \not\le \sigma$.

Corollary 1. Let $q$ be optimal. Then (i) $\alpha \notin S(q)$ for $\alpha \in \{0, -1\}^h$, (ii)
$\{\alpha, -\alpha\} \not\subseteq S(q)$ for any $\alpha \in A$, hence $|S(q)| \le \min\{N, 3^h/2\}$, (iii) at least
one component of $\sigma$ is zero, and (iv) if there are $\alpha \in S(q)$ and $i \in [1,h]$ such
that $\alpha_i = 1$ and $\alpha_j = -1$ for $j \ne i$, then $\sigma_i = 0$ and $\beta_i = 1$ for all $\beta \in S(q)$.

Some further details can be found in [14]. As the number of the reference 
words grows, the computational complexity for solving IPP grows. We may 
choose a relatively small subset of reference words to derive a little weaker but 
still useful lower bound. 

The reference words $u_1, u_2, \ldots, u_h$ in (5.2) can be partitioned into two
blocks, denoted RW$_1$ and RW$_2$, in such a way that $u_j \in$ RW$_1$ if and only if $u_j$
is chosen independently of any candidate codeword generated already. A block
RW$_1$ or RW$_2$ may be empty. Examples of RW$_1$ are $\{z\}$ for $L_{RO}$ in [13], $\{z, z + \mathbf{1}\}$
for $L_{ET}$ in [14], $\{v(1), \ldots, v(j)$ : decoding failures continue up to the $j$-th stage$\}$
for finding $v(j+1)$ in Example 4.3, $\{u^{(1)}, \ldots, u^{(i)}\}$ for finding $u^{(i+1)}$ in Example
4.4 and $\{z\}$ for $L_{ET}$ or $L_S$ in Example 4.5.

Assume that the first $h_1$ reference words are in RW$_1$. For $1 \le l \le h_1$,
define $A_l \triangleq \{p_{1,l}(\alpha) : \alpha \in A\}$ and for $\beta = (\beta_1, \beta_2, \ldots, \beta_l) \in A_l$, define
$D_\beta \triangleq \bigcap_{j=1}^{l} D_{\beta_j,I_j}(u_j)$. For $v \in V^N$ and $I \subseteq [1,N]$, if $D_{a,I}(v)$ with $a \in \{-1, 0, 1\}$
is empty or an interval, then $(v,I)$ is called interval type. For an interval $I$ or
$I = \emptyset$, $(z,I)$ and $(z + \mathbf{1},I)$ are interval type. If $u_l$ with $l > 1$ is an optimal
solution of IPP with reference words $u_1, u_2, \ldots, u_{l-1}$ such that $(u_j,I_j)$ is
interval type for $1 \le j < l$, then it follows from (4.3) that $u_l$ is also interval type.
Suppose $(u_j,I_j)$ is interval type for $1 \le j \le l$. Then $D_\beta$ is an interval for $\beta \in A_l$,
that is, there are two integers $\nu_1(\beta)$ and $\nu_2(\beta)$ such that $D_\beta = [\nu_1(\beta), \nu_2(\beta)]$.
For different $\beta$ and $\beta'$ in $A_l$, we write $\beta < \beta'$ if and only if $\nu_2(\beta) < \nu_1(\beta')$. Since
$D_\beta \cap D_{\beta'} = \emptyset$, $\beta < \beta'$ or $\beta' < \beta$ holds. This simplifies the solution of IPP. For
instance, (ii) of Lemma 5 becomes the following more useful version:

(ii') For $1 \le l \le h_1$, let $\beta$ and $\beta'$ be in $A_l$ such that $\beta' < \beta$. Then, for $\alpha$ and $\alpha'$
in $A$ such that $p_{1,l}(\alpha) = \beta$ and $p_{1,l}(\alpha') = \beta'$, either $q_\alpha = 0$, or $q_{\alpha'} = n_{\alpha'}$, or
$\alpha - \alpha' \not\le \sigma$.

Thus, it is better to choose a relatively small number of reference words in 
RW2. Reasonable choice rules are: i) give priority to a reference word Ui with a 
larger di or a smaller L{ui); and ii) to renew the test condition, the candidate 
codeword generated most recently must be chosen. 

Example 5.1: Consider the evaluation of $L_{S,h} \triangleq L[V^N_{d_1,d_2,\ldots,d_h}(u_1, u_2, \ldots, u_h)]$.
For this case, RW$_1 = \emptyset$, $\Xi = [1,h]$ and $I_j = [1,N]$ for $1 \le j \le h$. Formulas for
$L_{S,2}$ and $L_{S,3}$ were presented in [3,4] and an efficient algorithm for computing
$L_{S,4}$ is presented in [5]. Upper bounds on the number of real number additions
and comparisons for evaluating $L_{S,h}$ with $2 \le h \le 4$ have been derived in [20]
for $2 \le h \le 3$ and in [5] for $h = 4$. Table 1 lists the upper bounds under the
assumption of (4.3) and the number of sub-IPP's.
Tab. 1 Upper bounds on the number of real number additions and comparisons for
evaluating $L_{S,h}$ and the number of sub-IPP's, where $\delta = \min_{1 \le i \le h}\{\delta_i\}$.

h    Upper bound                        Number of sub-IPP's
2    $\delta - 1$                       1
3    $10\delta - 2$                     2
4    $310\delta^2 + 184\delta - 1$      9

Abstract. The use of curves in algebraic geometries in applications such
as coding and cryptography is now extensive. This note reviews recent
developments in the construction of such curves and their application to
these subjects.



1 Introduction 

The use of curves in algebraic geometries, in applications such as coding and
cryptography, now has an extensive literature and has been responsible for
dramatic developments in both subjects. In coding theory, the use of modular curves
to construct asymptotically good codes, i.e., to construct an infinite sequence of
codes whose parameters asymptotically exceed the Varshamov-Gilbert bound
over alphabets of size $\ge 49$, was a remarkable achievement. Likewise, the
introduction of the theory of elliptic curves to public key cryptography has opened
up a new area of both practical and theoretical significance.

This note reviews some of the current developments and directions of research
in areas related to the construction of curves with many points. Of necessity it is
highly selective. The term 'curves with many points' refers to curves whose set
of points has order close to the maximum allowed by the Hasse-Weil theorem,
namely $q + 1 + 2g\sqrt{q}$. The construction of such curves is an interesting subject
in its own right. The use of such curves in coding theory results in codes of long
length, relative to the dimension and distance, in comparison to codes obtained
from other curves of the same genus and alphabet size.

The application of elliptic curves to cryptography is to replace the more
usual finite groups, such as the multiplicative group of a finite field, with the
group of points on the curve under point addition. As such, the construction of
curves with many points (which in the case of elliptic curves means the upper
bound of $q + 1 + 2\sqrt{q}$) is less important than the size of the largest prime order
subgroup of points. However, by recourse to the prime number theorem and
the fairly uniform distribution of curve orders over the interval $q + 1 \pm 2\sqrt{q}$, as
the curve parameters range over all allowable values, one can argue that curves
with many points are of interest in this case as well. The reason for interest
in such cryptosystems is that the same level of security can be achieved with
much smaller key lengths compared to the more conventional systems, resulting
in more efficient implementations.
This work reviews the subject and its applications, with a particular
emphasis on recent developments and directions. The next section presents some
background material needed for a discussion of the subject and its applications
to coding theory and cryptography, which are considered in the succeeding
sections.



2 Preliminaries 

A very brief introduction to the background needed to discuss the problems 
addressed in this work is given. More complete introductions to the material are 
given in [32], [13], [43] and [3]. 

Let $\mathbb{F}_q$ be a finite field and $\bar{\mathbb{F}}_q$ its algebraic closure. Let $f(x,y) \in \mathbb{F}_q[x,y]$ be a
bivariate polynomial over $\mathbb{F}_q$ and consider the set

$$C = \{(u,v) \in \bar{\mathbb{F}}_q \times \bar{\mathbb{F}}_q \mid f(u,v) = 0\},$$

i.e. the set of solutions of the equation $f(x,y) = 0$ over $\bar{\mathbb{F}}_q$. Each such solution is
referred to as a point of the curve. The subset of these points with coordinates
in $\mathbb{F}_q$ is referred to as the set of rational points. We will only be interested in the
case where $f(x,y) \in \mathbb{F}_q[x,y]$ is an irreducible polynomial over $\bar{\mathbb{F}}_q$ and where the
curve is smooth, i.e. contains no singular points, points where both of
the formal partial derivatives vanish. There is an equivalent approach to the subject
in terms of function fields for which we refer to [43].

A divisor of the curve $C$ is a finite formal sum

$$D = \sum_{P_i \in C} m_i P_i, \quad m_i \in \mathbb{Z},$$

where only a finite number of the integers $m_i$ are nonzero. The degree of the
divisor is $\sum_i m_i$. Denote by $\mathbb{D}$ the set of all divisors of $C$, an abelian group, and
by $\mathbb{D}^0$ the set of divisors of degree 0, a subgroup of $\mathbb{D}$. Denote $D \ge 0$ if $m_i \ge 0$
for all $i$.

The greatest common divisor (gcd) of divisors $D = \sum_i m_i P_i$ and $D' =
\sum_i m'_i P_i$ is defined to be $\gcd(D,D') = \sum_i \min(m_i, m'_i) P_i - kO$ where $k =
\sum_i \min(m_i, m'_i)$, i.e. the gcd is a divisor of degree 0. To a bivariate polynomial
$g(u,v) \in \mathbb{F}_q[u,v]$ can be associated the divisor $(g) = \sum_i m_i P_i - kO \in \mathbb{D}^0$, where
$k$ is chosen so the divisor has degree 0, and the integer $m_i$ represents the order
of vanishing of $g$ at $P_i$. Associated with the rational function $g(u,v)/h(u,v)$ is
the divisor $(g) - (h)$. A divisor of this form is referred to as a principal divisor.
Clearly all such principal divisors have degree 0 and they form a subgroup $\mathbb{P}$
of $\mathbb{D}^0$. The factor group $\mathbb{D}^0/\mathbb{P}$ is referred to as the jacobian $\mathbb{J}$ of the curve and
will play a role in the formulation of cryptosystems using hyperelliptic curves.
Notice that the notions of divisors discussed above hold also for curves and their
corresponding divisors defined with polynomials of a single variable. Associated
with a curve is an invariant $g$, the genus of the curve, which is a non-negative







integer which is related to the dimension of certain subspaces associated with 
divisors of the curve. 

A fundamental result on curves is the Hasse-Weil theorem, which states that
the number of rational points, $N_1$, on a curve of genus $g$ is bounded by the
relation

$$|N_1 - (q + 1)| \le 2g q^{1/2}.$$

The right hand side of this bound was improved by Serre [42] to $g \lfloor 2q^{1/2} \rfloor$.
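
For genus 1 the bound is easy to check by direct counting. The sketch below counts the rational points of a curve $y^2 = x^3 + ax + b$ over a prime field $\mathbb{F}_p$ and tests the Hasse-Weil inequality in integer arithmetic. The restriction to this short Weierstrass form with $p > 3$, the requirement that the curve be nonsingular, and the function names are assumptions of this illustration.

```python
def count_points(p, a, b):
    """Number of F_p-rational points on y^2 = x^3 + a*x + b, including the
    point at infinity. Assumes p is a prime greater than 3."""
    squares = {}
    for y in range(p):
        s = y * y % p
        squares[s] = squares.get(s, 0) + 1     # how many y give each square
    affine = sum(squares.get((x ** 3 + a * x + b) % p, 0) for x in range(p))
    return affine + 1

def within_hasse_weil(n1, q, g=1):
    """|N_1 - (q + 1)| <= 2 g sqrt(q), tested without floating point."""
    return (n1 - (q + 1)) ** 2 <= 4 * g * g * q
```

For instance, `count_points(7, 1, 1)` returns 5, and $|5 - 8| = 3 \le 2\sqrt{7}$.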

When the genus of the curve is large compared to q, this bound is not very 
effective. A more accurate bound is given in this case in a theorem by Oesterle 
(and reported by Schoof [38]). 

It can be shown that the number of solutions (of the curve defined over $\mathbb{F}_q$)
over $\mathbb{F}_{q^r}$, $r > g$, is determined by $N_1, N_2, \ldots, N_g$, where $N_t$ is the number of
solutions over $\mathbb{F}_{q^t}$. The relationship is made explicit via the zeta function.

Define a curve of genus $g$ over $\mathbb{F}_q$ with $N_1$ points to be maximal if $N_1 = q +
1 + 2g q^{1/2}$. To achieve this upper bound it is clearly necessary that $q$ be a square.
For a given genus $g$ and finite field size $q$, let $N_q(g)$ be the maximum
number of points on any curve of genus $g$ over $\mathbb{F}_q$, $N_q(g) \le q + 1 + 2g q^{1/2}$. A curve
with this maximum number of points $N_q(g)$ will be called optimal. Some limited,
specific results on the existence of certain classes of optimal and maximal curves
are known (e.g. [12]). A curve that contains 'almost' $q + 1 + 2g q^{1/2}$ points will
be referred to as "a curve with many points". Subsequent sections survey recent
contributions to the problem of constructing curves with many points.

Asymptotics: Define the quantity $A(q) = \limsup_{g \to \infty} N_q(g)/g$ and notice from
the Hasse-Weil bound that $A(q) \le 2q^{1/2}$ or, using the Serre improvement,
$A(q) \le \lfloor 2q^{1/2} \rfloor$. A variety of upper and lower bounds on this quantity are known
([25], [44], [8], [42], [50], [40]), some of them obtained by exhibiting a suitable
infinite family of modular curves. Niederreiter and Xing [37] have recently established a variety of
lower bounds on $A(q)$ including the fact that, for $q$ odd and $m \ge 3$, $A(q^m) \ge
2q/(\lceil 2(2q + 1)^{1/2} \rceil + 1)$. Additionally they show that $A(2) \ge 81/317$, $A(3) \ge
62/163$, and $A(5) \ge 2/3$, improving on the previously best bounds in these cases
of 2/9, 1/3 and 1/2 respectively. Many other bounds on $A(q)$, both upper and
lower, exist but are not pursued here.

Construction Techniques: A variety of techniques have been used to establish 

curves with many points and a web site containing many of the ‘best’ curves 
available is maintained [17]. The techniques have been categorized as: 

I Methods from general class field theory; 

II Methods from class field theory based on Drinfeld 
modules of rank 1; 

III Fibre products of Artin-Schreier curves; 

IV Towers of curves with many points; 

V Miscellaneous methods such as: (i) formulas for $N_q(1)$ and $N_q(2)$;

(ii) explicit curves, curves, (iii) elliptic modular curves 

(iv) Deligne-Lusztig curves; (v) quotients of curves with many points. 
Numerous techniques to construct curves with many points that depend on
coding theory are given in [18]. In essence, the codewords lead to families of
curves and low weight codewords lead to curves with many points. The role
of coding theory to yield linear spaces of curves with many points is then of
interest.

Elliptic Curves: It can be shown [31] that an elliptic curve over $\mathbb{F}_q$ can always be
reduced to the form:

$$E : y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6, \quad a_i \in \mathbb{F}_q.$$

The set of solutions of the equation over $\bar{\mathbb{F}}_q$, together with the point at infinity
$O$, form the points of the curve $E(\bar{\mathbb{F}}_q)$. All such curves have genus 1. The above
(Weierstrass) equation can be further reduced to one of two forms, depending
on whether the characteristic of the field is even or odd.

The maximum number of points such a curve of genus 1 can have over $\mathbb{F}_q$ is
completely determined ([41], [45]). For $q = p^a$, if $a$ is an odd integer, $a \ge 3$ and
$p \mid \lfloor 2q^{1/2} \rfloor$, then $q$ is called exceptional. Then

$$N_q(1) = \begin{cases} q + 1 + \lfloor 2q^{1/2} \rfloor, & q \text{ nonexceptional,} \\ q + \lfloor 2q^{1/2} \rfloor, & q \text{ exceptional.} \end{cases}$$

Partial results are available for $g = 2$. There also exist necessary and sufficient
conditions [45] for the existence of a curve with exactly $N$ points when the
characteristic does not divide $(N - q - 1)$.
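
The formula for $N_q(1)$ is simple to evaluate. The sketch below computes it for $q = p^a$, using an integer square root so that $\lfloor 2q^{1/2} \rfloor$ is exact for large $q$. The function name is hypothetical, and the code assumes without checking that `p` is prime.

```python
import math

def max_points_genus_one(p, a):
    """N_q(1) for q = p**a, following the exceptional / nonexceptional rule."""
    q = p ** a
    m = math.isqrt(4 * q)       # floor(2 * sqrt(q)), computed exactly
    exceptional = (a % 2 == 1) and (a >= 3) and (m % p == 0)
    return q + m if exceptional else q + 1 + m
```

For example, $q = 2^7 = 128$ is exceptional because $\lfloor 2\sqrt{128} \rfloor = 22$ is divisible by 2, so the function returns 150 rather than 151.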

Hyperelliptic Curves: As a generalization of elliptic curves, consider the
hyperelliptic curves, defined by an equation of the form:

$$y^2 + h(x) y = k(x), \quad h(x), k(x) \in \mathbb{F}_q[x],$$

where $h(x)$ is a polynomial of degree at most $g$ and $k(x)$ is monic of degree
exactly $2g + 1$. We require the curve to be smooth, i.e. to have no singular points
$(x,y) \in \bar{\mathbb{F}}_q^2$ where both of the partial derivatives $2y + h(x)$ and $h'(x) y - k'(x)$
vanish ([27]). In such a case the genus of the curve is $g$. Notice that for elliptic
curves $h(x)$ is at most a linear polynomial and $k(x)$ a monic cubic, and the curve
is of genus 1.

If char($\mathbb{F}_q$) $\ne 2$ then the change of variables $x \to x$, $y \to y - (h(x)/2)$
transforms the equation to $y^2 = f(x)$, $\deg f(x) = 2g + 1$. In this case, if
$P = (x,y)$ is a point on the curve, then $(x, -y - h(x))$ is also a point on the
curve, the sum of $P$ and the point at infinity. If char($\mathbb{F}_q$) $= 2$ and $(x,y)$ is on
the curve, then so also is $(x, y + h(x))$. This point will be designated $\tilde{P}$.
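
The statement that $(x, -y - h(x))$ lies on the curve whenever $(x, y)$ does can be verified mechanically over a small prime field. The sketch below enumerates the affine points of $y^2 + h(x)y = k(x)$ over $\mathbb{F}_p$ and computes the opposite point; in characteristic 2 the same formula reduces to $(x, y + h(x))$. Polynomials are given as coefficient lists from the constant term upward. The helper names are hypothetical and smoothness of the curve is not checked.

```python
def poly_eval(coeffs, x, p):
    """Horner evaluation modulo p; coeffs are listed from the constant term up."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def affine_points(h, k, p):
    """All (x, y) in F_p x F_p with y^2 + h(x) y = k(x)."""
    return [(x, y) for x in range(p) for y in range(p)
            if (y * y + poly_eval(h, x, p) * y - poly_eval(k, x, p)) % p == 0]

def opposite(P, h, p):
    """The point (x, -y - h(x)) associated with P = (x, y)."""
    x, y = P
    return (x, (-y - poly_eval(h, x, p)) % p)
```

With $h(x) = 1$ and $k(x) = x^5 + 1$ over $\mathbb{F}_7$ (a genus 2 shape, chosen arbitrarily), every point returned by `affine_points([1], [1, 0, 0, 0, 0, 1], 7)` has its opposite in the same list.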

Just as the zeta function leads to the enumeration of points on a curve over
extensions of the base field, given the number on the first $g$ extensions, a slight
modification leads to the enumeration of the number of points in the Jacobian
$\mathbb{J}(\mathbb{F}_{q^r})$, $r > g$, given the number of points on the curve for $r = 1, 2, \ldots, g$ [27].
3 Curves in Coding Theory

For a given divisor $G$ on a curve $C$ define a subset of rational functions on the
curve by

$$L(G) = \{f \in \mathbb{F}_q(x,y) \mid (f) + G \ge 0\} \cup \{0\}.$$

Clearly $L(G)$ is a finite dimensional vector space over $\mathbb{F}_q$. Denote its dimension
by $l(G)$. A consequence of the Riemann-Roch theorem [10] is then that

$$l(G) \ge \deg(G) + 1 - g,$$

where $g$ is the genus of the curve. Indeed for a divisor $G$ with $\deg(G) > 2g - 2$,
this relation will hold with equality.

To construct the class of codes that will be of interest, let $C$ be a nonsingular
irreducible curve over $\mathbb{F}_q$ of genus $g$ and let $P_1, P_2, \ldots, P_n$
be the rational points on $C$ and $D = P_1 + P_2 + \cdots + P_n$, a divisor. Let
$G$ be a divisor with support disjoint from $D$ and assume that
$2g - 2 < \deg(G) < n$.

Define the linear code $C(D, G)$ over $\mathbb{F}_q$ as the image of the linear
map [43]

$\alpha : L(G) \to \mathbb{F}_q^n$

$f \mapsto (f(P_1), f(P_2), \ldots, f(P_n))$.

This is often referred to as the evaluation construction of codes from curves in
the literature.

The parameters of the code $C(D, G)$ are established by using the properties
discussed above. The kernel of the map is the set $L(G - D)$ and, for the
restrictions on the parameters noted,
$k = \dim C(D, G) = \dim L(G) - \dim L(G - D) = \deg(G) - g + 1$. The minimum
distance of $C(D, G)$ is $d \ge d^* = n - \deg(G)$, the designed minimum distance.
The other standard construction for codes, the residue construction, uses the
idea of differentials of the curve, and will not be considered.
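
As a small illustration of the evaluation construction, the following sketch
treats the simplest possible case, which is an assumption made here and not a
case discussed in the text: the projective line (genus $g = 0$) over a prime
field, with $G = m P_\infty$, so that $L(G)$ is the space of polynomials of
degree at most $m$ and the code is a Reed-Solomon code. The names
`evaluation_code` and `min_distance` are hypothetical helpers. The brute-force
search lets one compare the true minimum distance with the designed distance
$n - \deg(G)$.

```python
from itertools import product

def evaluation_code(q, points, m):
    """All codewords (f(P_1), ..., f(P_n)) for polynomials f over F_q
    (q prime, arithmetic modulo q) with deg f <= m."""
    words = []
    for coeffs in product(range(q), repeat=m + 1):
        word = tuple(
            sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q
            for x in points
        )
        words.append(word)
    return words

def min_distance(words):
    # For a linear code the minimum distance is the least nonzero weight.
    return min(sum(1 for s in w if s) for w in words if any(w))

q, m = 7, 2                      # toy parameters, chosen for illustration
points = list(range(q))          # the q affine rational points of the line
code = evaluation_code(q, points, m)
n, k = len(points), m + 1        # k = deg(G) - g + 1 with g = 0
d = min_distance(code)
print(n, k, d)
```

For genus 0 the designed distance $n - \deg(G)$ equals the Singleton bound
$n - k + 1$, so the bound $d \ge d^*$ is attained in this toy case.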

Using the above evaluation construction of codes for elliptic curves, one can
define a class of codes over $\mathbb{F}_q$ with the parameters length
$n \le q + 1 + 2\sqrt{q}$, dimension $k$ and minimum distance $n - k$. Since the
distance is one away from the Singleton bound, one often refers to this as a code
with 'defect' of 1. Further information on codes from elliptic curves is given
in [21].

One can also use the construction on hyperelliptic curves and Xing [46] deter- 
mines certain classes of such codes whose minimum distance can be determined 
exactly. 

Other aspects of the relationship between algebraic curves and codes are 
touched upon. The first establishes the connection between codewords of low 
weight in certain trace codes and curves with many points. The emphasis here 
will be on Reed-Muller (RM) codes and follow [20]. The generalized RM code
$R_q(s, m)$ can be defined as follows. Consider the vector space (over
$\mathbb{F}_q$)

$P_s = \{ f \in \mathbb{F}_q[X_1, X_2, \ldots, X_m] \mid \deg(f) \le s \}$

and denote the evaluation map by $\beta : P_s \to \mathbb{F}_q^{q^m}$,
$f \mapsto (f(x))_{x \in \mathbb{F}_q^m}$. The kernel of this map is the ideal
generated by the polynomials $X_i^q - X_i$ and a polynomial $f \in P_s$ is called
reduced if $f$ is an $\mathbb{F}_q$ linear combination of monomials
$X_1^{d_1} \cdots X_m^{d_m}$, $0 \le d_i \le q - 1$. The image $\beta(P_s)$ is the
RM code $R_q(s, m)$ of order $s$ in $m$ variables.

It can be shown that the evaluation codeword corresponding to certain
polynomials $f$ can in fact be written in the form:

$c_f = (\mathrm{Tr}(R(x)))_{x \in \mathbb{F}_{q^m}}$ for some
$R(x) \in \mathbb{F}_{q^m}[x]$.

An element in $\mathbb{F}_{q^m}$ has trace 0 if and only if it can be written in
the form $y^q - y$ for some $y \in \mathbb{F}_{q^m}$. Now if we associate to the
codeword $c_f$ the irreducible complete smooth (Artin-Schreier) curve $C_f$ over
$\mathbb{F}_{q^m}$ given by the equation $y^q - y = R(x)$, then the curve has genus
$g(C_f) = (q - 1)(\deg(R) - 1)/2$. It follows immediately that

$w(c_f) = q^m - (|C_f(\mathbb{F}_{q^m})| - 1)/q$,

giving an interesting relationship between codewords of low weight and curves
with many points. In [20] these considerations are also shown to lead in some
cases to maximal curves, achieving the Hasse-Weil upper bound for the case of
small genus, i.e., ([16], [20]) $g \le (\sqrt{q} - 1)^2/4$ or
$g = (q - \sqrt{q})/2$ (see [49]). Much of this work exploits the connection
between generalized Hamming weights and curves with many points.
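
The weight formula can be checked by brute force in a toy case. The sketch
below assumes $q = 2$, $m = 3$, the field $\mathbb{F}_8$ built from the
polynomial $x^3 + x + 1$, and the example $R(x) = x^3 + x$; all of these
choices, and the helper names, are illustrative assumptions and not taken from
[20]. It counts the affine points of $y^2 - y = R(x)$, adds the single point at
infinity, and compares with the weight of the trace codeword.

```python
M = 3
MOD = 0b1011                     # x^3 + x + 1, irreducible over F_2

def gf_mul(a, b):
    """Multiply two elements of F_8, stored as 3-bit integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:         # reduce as soon as the degree reaches M
            a ^= MOD
    return r

def trace(a):
    """Tr(a) = a + a^2 + a^4, an element of F_2 (0 or 1)."""
    t, s = 0, a
    for _ in range(M):
        t ^= s
        s = gf_mul(s, s)
    return t

def R(x):
    # Hypothetical example polynomial R(x) = x^3 + x.
    return gf_mul(gf_mul(x, x), x) ^ x

field = range(1 << M)
weight = sum(trace(R(x)) for x in field)                 # w(c_f)
affine = sum(1 for x in field for y in field
             if (gf_mul(y, y) ^ y) == R(x))              # y^2 - y = R(x)
num_points = affine + 1                                   # point at infinity
print(weight, num_points)
```

Each $x$ with $\mathrm{Tr}(R(x)) = 0$ contributes exactly $q$ affine points and
every other $x$ contributes none, which is the counting argument behind the
formula.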

Another contribution to the relationship between codes and curves is the
work of Niederreiter and Xing [37] on the existence of asymptotically good
codes, i.e., sequences of codes whose rate/distance profile asymptotically exceeds
the Varshamov-Gilbert bound. They are able to show a result equivalent to the
celebrated one [44] for $q$ a square. Specifically it is shown that for $m \ge 3$
an odd integer and $r$ a prime power with $r \ge 100m^3$ for odd $r$ and
$r \ge 576m^3$ for even $r$, there is an open interval in $(0, 1)$ which contains
the point $(r^m - 1)/(2r^m - 1)$ where the Varshamov-Gilbert bound can be
exceeded. The result is achieved by establishing lower bounds on $A(q^m)$, some
of which were mentioned earlier.



4 Curves in Cryptography 

The notion from public key cryptography that will be required for our
presentation is that of a one-way function, on which cryptographic protocols,
such as the Diffie-Hellman key exchange, depend. Other protocols, such as digital
signatures, authentication and others follow readily, depending on the function
in use. The particular one-way function of interest here is that of the discrete
logarithm. In the multiplicative group of the finite fields $\mathbb{F}_{2^m}$ or
$\mathbb{F}_p$, $p$ a prime, the order of complexity of the discrete logarithm
problem in both time and space is $L(1/3, c, N)$ where

$L(v, c, N) = \exp\{c (\log N)^v (\log\log N)^{1-v}\}$

for some constant $c$ where $N$ is the largest prime divisor of the multiplicative
group order. For a prime field, the constant $c$ is approximately 1.6. This
complexity function represents subexponential growth in $\log N$. Establishing
this complexity for the discrete logarithm problem uses a technique known as
index calculus, which is facilitated by the group actually having two operations,
residing as it does in a finite field. To ensure the discrete logarithm problem
is as difficult as possible, the cyclic group order, $q - 1$, should have as
large a prime divisor as possible.
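
To give a feeling for the size of $L(1/3, c, N)$, the following sketch evaluates
its base-2 logarithm and compares it with the exponent of a square root attack
on a group of the same size. The function name `log2_L` and the choice of a
1024-bit $N$ are assumptions made purely for illustration; the value $c = 1.6$
is the one quoted above.

```python
import math

def log2_L(v, c, N):
    """Base-2 logarithm of L(v, c, N) = exp(c (log N)^v (log log N)^(1-v))."""
    ln_n = math.log(N)           # math.log accepts arbitrarily large integers
    exponent = c * ln_n ** v * math.log(ln_n) ** (1 - v)
    return exponent / math.log(2)

bits = 1024
N = 2 ** bits
subexp_bits = log2_L(1 / 3, 1.6, N)     # index-calculus style cost, in bits
sqrt_bits = bits / 2                    # square root attack cost, in bits
print(round(subexp_bits, 1), sqrt_bits)
```

The gap between the two exponents is the reason groups without a known index
calculus can use much smaller parameters for a comparable level of security.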

The essence of the application of curves, either elliptic or hyperelliptic, to 
the discrete logarithm problem, is to replace the groups previously used with 
additive groups defined from the curves. The presentation of these groups and 
the group operations, is the focus of the cryptographic aspects of the problem. 

On an elliptic curve it is possible to define a point addition such that the
sum of any two points on the curve uniquely defines a third, turning the set of
curve points into a commutative additive group. The identity or neutral element
of the group is the point at infinity $O$. The discrete logarithm problem in the
additive group of points on the curve is: given a point $P$ of large prime order
$N$ and a point multiple $c \cdot P = P + \cdots + P$ ($c$ times), determine the
integer $c$. No subexponential algorithm is known to attack this problem. The
most efficient algorithm known is the so-called square root attack (or
baby-step-giant-step) which has complexity on the order of the square root of the
largest prime divisor of the group order. The essence of the problem appears to
be the lack of an index calculus for it, as was possible for $\mathbb{F}_q^*$.
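
The square root attack can be sketched on a toy curve. The example below
assumes the curve $y^2 = x^3 + 2x + 3$ over $\mathbb{F}_{97}$ and the base point
$(0, 10)$; these parameters and all function names are hypothetical choices for
illustration, far too small for any cryptographic use. The baby-step-giant-step
routine needs about $\sqrt{N}$ group operations and stored points for a point of
order $N$.

```python
import math

P_MOD, A, B = 97, 2, 3           # toy curve y^2 = x^3 + A x + B over F_97
O = None                         # the point at infinity

def add(P, Q):
    """Chord-and-tangent addition on the toy curve."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O                 # Q = -P (covers doubling a 2-torsion point)
    if P == Q:
        s = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)

def mul(c, P):
    """Double-and-add computation of c * P."""
    R = O
    while c:
        if c & 1:
            R = add(R, P)
        P = add(P, P)
        c >>= 1
    return R

def neg(P):
    return O if P is O else (P[0], (-P[1]) % P_MOD)

def point_order(P):
    n, R = 1, P
    while R is not O:
        R = add(R, P)
        n += 1
    return n

def bsgs(P, Q, order):
    """Return c with c * P == Q, or None if Q is not a multiple of P."""
    m = math.isqrt(order) + 1
    baby, R = {}, O
    for j in range(m):           # baby steps: j * P for 0 <= j < m
        baby.setdefault(R, j)
        R = add(R, P)
    step = neg(mul(m, P))        # giant step: subtract m * P
    G = Q
    for i in range(m):
        if G in baby:
            return (i * m + baby[G]) % order
        G = add(G, step)
    return None

base = (0, 10)                   # 10^2 = 100 = 3 (mod 97), so base is on the curve
N = point_order(base)
secret = 29 % N
found = bsgs(base, mul(secret, base), N)
print(N, found)
```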

A problem of some importance for the use of elliptic curves in cryptogra- 
phy, is the ability to determine precisely the number of points on the curve, 
or equivalently to determine curve parameters that yield a curve whose group 
order contains a large prime divisor. This so-called ‘point counting’ problem is 
both a theoretically and practically challenging one. The more general problem 
of counting points on curves of higher genus is even more so [9].

With the success of using elliptic curves for public key cryptosystems, it 
is natural to consider the use of other curves. A group structure is required in 
order to uniquely carry out the public key operations and no such addition seems 
possible on more general curves. However, it is possible to define an addition on 
the Jacobian of a hyperelliptic curve and we give a few comments on this situation 
here. 

The definition of the Jacobian of a curve given previously is valid for any 
curve. There are two reasons however why the Jacobian of a hyperelliptic curve 
is especially interesting [27]. Since the definition of the Jacobian is as a quotient 
group of two infinite groups, it is necessary to be able to describe in an efficient 
manner, representatives of cosets or, to be able to determine when two divisors 
are in the same coset, and hence identified. Secondly it is important to be able 
to add two elements of the Jacobian and, again, identify the coset the result is 
in. For the Jacobian of a hyperelliptic curve, these two problems have effective 
algorithms. For the first problem, it can be shown that every element in
$\mathbb{J}$ can be uniquely represented as a reduced divisor defined as follows:

Definition 1. A divisor $D = \sum_i m_i P_i - (\sum_i m_i)\infty \in D^0$ is said
to be reduced if:

(i) All the $m_i$ are non-negative and $m_i \le 1$ if $P_i = \tilde{P}_i$.

(ii) If $P_i \ne \tilde{P}_i$ then $P_i$ and $\tilde{P}_i$ do not both occur in
the sum.

(iii) $\sum_i m_i \le g$.

Indeed any reduced divisor can be uniquely represented as the gcd of poly- 
nomials of a certain form. Likewise, an algorithm to add two divisors is also 
possible (see [27], Section 5. and Appendix) although it is too detailed to give 
here. 

Hyperelliptic curve cryptosystems have been seriously investigated by several 
prominent researchers. To date they seem to have disadvantages with respect to 
efficiency of operations, compared to elliptic curve systems at the same level 
of security and they have not been developed very far. At the same time a 
subexponential algorithm of complexity $O(\sqrt{g \log g})$, for fixed prime $p$, has been
found for hyperelliptic curves of large genus, although no such algorithm has 
been found for the lower genus case. 

5 Comments 

The construction of curves with many points is an interesting area in its own 
right. Not only have these constructions given insights into the structure of such
curves, they have also influenced developments in areas such as coding theory 
and cryptography, as touched upon in this article. New applications continue to 
be found such as low discrepancy sequences, sequences with low correlation and 
authentication codes, largely by Xing, Niederreiter and their co-authors. The 
continued development of constructions of such curves and their applications 
will be followed with interest. The set of references included here goes beyond 
the topics described, as a convenience for the reader wishing to pursue a topic.

Abstract. We will introduce a new class of erasure codes built from
irregular bipartite graphs that have linear time encoding and decoding 
algorithms and can transmit over an erasure channel at rates arbitrarily 
close to the channel capacity. We also show that these codes are close 
to optimal with respect to the trade-off between the proximity to the 
channel capacity and the running time of the recovery algorithm. 



1 Introduction 

A linear error-correcting code of block length $n$ and dimension $k$ over a finite
field $\mathbb{F}_q$ (an $[n, k]_q$-code for short) is a $k$-dimensional linear
subspace of the standard vector space $\mathbb{F}_q^n$. The elements of the code
are called codewords. To the code $C$ there corresponds an encoding map Enc which
is an isomorphism of the vector spaces $\mathbb{F}_q^k$ and $C$. A sender who
wishes to transmit a vector of $k$ elements in $\mathbb{F}_q$ to a receiver uses
the mapping Enc to encode that vector into a codeword. The rate $k/n$ of the code
is a measure for the amount of real information in each codeword. The minimum
distance of the code is the minimum Hamming distance between two distinct
codewords. A linear code of block length $n$, dimension $k$, and minimum distance
$d$ over $\mathbb{F}_q$ is called an $[n, k, d]_q$-code.

Linear codes can be used to reliably transmit information over a noisy chan- 
nel. Depending on the nature of the errors imposed on the codeword during 
the transmission, the receiver then applies appropriate algorithms to decode the 
received word. In this paper, we assume that the receiver knows the position of 
each received symbol within the stream of all encoding symbols. We adopt as 
our model of losses the erasure channel, introduced by Elias [3], in which each 
encoding symbol is lost with a fixed constant probability p in transit indepen- 
dent of all the other symbols. As was shown by Elias [3], the capacity of this 
channel equals 1 — p. 

It is easy to see that a code of minimum distance $d$ is capable of recovering
$d - 1$ or fewer erasures. In the best case, it can recover from any set of $k$
coordinates of the encoding, which means that $d - 1 = n - k$. Such codes are
called MDS-codes. A standard class of MDS-codes is given by Reed-Solomon
codes [10]. The connection of these codes with polynomial arithmetic allows for
encoding and decoding in time $O(n \log^2 n \log\log n)$. (See [2, Chapter 11.7]
and [10, p. 369].) However, these codes do not reach the capacity of the erasure
channel, since there is no infinite sequence of such codes over a fixed field.

Elias [3] showed that a random linear code can be used to transmit over the
erasure channel at any rate $R < 1 - p$, and that encoding and decoding can be
accomplished with $O(n^2)$ and $O(n^3)$ arithmetic operations, respectively.
Hence, we have on the one hand codes that can be encoded and decoded faster than
general linear codes, but do not reach the capacity of the erasure channel; and
on the other hand we have random codes which reach the capacity but have
encoding and decoding algorithms of higher complexity.

The paper [1] was the first to design codes that could come arbitrarily close
to the channel capacity while having linear time encoding and decoding
algorithms. Improving these results, the authors of [8] took a different approach
and designed fast linear-time algorithms for transmitting just below channel
capacity. For all $\epsilon > 0$ they were able to produce rate
$R = 1 - p(1 + \epsilon)$ codes along with decoding algorithms that could recover
from the random loss of a $p$ fraction of the transmitted symbols in time
proportional to $n \ln(1/\epsilon)$ with high probability, where $n$ is the block
length. These codes could also be encoded in time proportional to
$n \ln(1/\epsilon)$. They belong to the class of low-density parity check codes of
Gallager [4]. In contrast to Gallager codes, however, the graphs used to
construct the asymptotically good codes obtained in [8] are highly irregular.

The purpose of the present paper is twofold. First, we prove a general trade-off 
theorem between the proximity of a given Gallager code to the channel capacity 
in terms of the loss fraction and the running time of the recovery algorithm of [8].
We show that in this respect, the codes constructed in that paper are close to 
optimal. Next, we exhibit a different sequence of asymptotically close to optimal 
codes which have better parameters than the codes in [8]. An interesting feature
of these codes is that the underlying bipartite graphs are right regular, i.e., all 
nodes on the right hand side of the graph have the same degree. Since they are 
theoretically better than their peers, we expect them to also perform better in 
practice. 

The organization of the paper is as follows. In the next section we will re- 
view the construction of Gallager codes. Next, we prove upper bounds on the 
maximum tolerable loss fraction in terms of the running time of the decoding al- 
gorithm. The last two sections are concerned with the derivation of the sequence 
of right regular erasure codes. 



2 Codes from Bipartite Graphs 

In this section, we will briefly review the class of codes we are interested in, and 
the erasure recovery algorithm associated to them. 

Our codes are similar to the Gallager codes [4] in that they are built from 
sparse bipartite graphs. In contrast to Gallager codes, however, our codes will 
be constructed from graphs that have a highly irregular degree pattern on the 
left. 

Let $G$ be a bipartite graph with $n$ nodes on the left and $n - k$ nodes on the
right. $G$ gives rise to a binary code of block-length $n$ and dimension $\ge k$
in the following way: let the adjacency matrix of the graph $G$ be given as

$A = \begin{pmatrix} 0 & H^\top \\ H & 0 \end{pmatrix}$,

where $H$ is some $(n - k) \times n$ matrix describing the connections in the
graph. The code defined by the graph is the code with parity check matrix $H$. A
different way of describing the code is as follows: we index the coordinate
positions of the code with the $n$ nodes on the left hand side of the graph. The
code consists of all binary vectors $(c_1, \ldots, c_n)$ such that for each right
node in the graph the sum of the coordinate places adjacent to it equals zero.
The block-length of this code equals $n$, and its dimension is at least $k$ since
we are imposing $n - k$ linear conditions on the coordinates of a codeword.
Expressed in terms of the graph, the fraction of redundant symbols in a codeword
is at most $a_L/a_R$ where $a_L$ and $a_R$ are the average node degrees on the
left and the right hand side of the graph, respectively. In other words, the rate
of the code is at least $1 - a_L/a_R$. This description of the rate will be
useful in later analysis. In the following, we will assume that the rate is in
fact equal to this value. This is because the statements we will prove below will
become even stronger if the rate is larger.

The above construction needs asymptotically $O(n^2)$ arithmetic operations to
find the encoding of a message of length $k$, if the graph is sparse. One can
apply a trick to reduce the running time to $O(n)$ by a modification of the
construction. Details can be found in [8].

Suppose now that a codeword $(c_1, \ldots, c_n)$ is sent and that certain
erasures have occurred. The erasure recovery algorithm works as follows. We first
initialize the contents of the right hand nodes of $G$ with zero. Then we collect
the non-erased coordinate positions, add their value to the current value of
their right neighbors, and delete the left node and all edges emanating from it
from the graph. After this stage, the graph consists of the erased nodes on the
left and the edges emanating from these nodes. In the next step we look for a
right node in the graph of degree one, i.e., a node that has only one edge coming
out of it. We transport the value of this node to its unique left neighbor
$\ell$, thereby recovering the value of $c_\ell$. We add $c_\ell$ to the current
value of all the right neighbors of $\ell$, delete the edges emanating from
$\ell$, and repeat the process until we cannot find a node of degree one on the
right, or until all nodes on the left have been recovered.
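
The recovery procedure just described can be sketched as follows. The data
layout (one list of left neighbors per right node, `None` for an erased symbol)
and the function name `peel_decode` are assumptions made for this illustration,
and the sketch assumes a simple graph without repeated edges. The small example
uses the parity checks of a [7,4] Hamming code, chosen only because it is easy
to verify by hand.

```python
def peel_decode(checks, received):
    """checks[r] lists the left nodes adjacent to right node r.
    received[l] is 0, 1, or None for an erased position.
    Returns the word with every recoverable erasure filled in."""
    word = list(received)
    value = [0] * len(checks)             # running sum (XOR) at each right node
    unknown = [set() for _ in checks]     # erased neighbors of each right node
    rights_of = {}                        # left node -> adjacent right nodes
    for r, nbrs in enumerate(checks):
        for l in nbrs:
            rights_of.setdefault(l, []).append(r)
            if word[l] is None:
                unknown[r].add(l)
            else:
                value[r] ^= word[l]       # known symbols are absorbed at once
    ready = [r for r in range(len(checks)) if len(unknown[r]) == 1]
    while ready:
        r = ready.pop()
        if len(unknown[r]) != 1:          # degree changed since it was queued
            continue
        l = next(iter(unknown[r]))
        word[l] = value[r]                # the check forces the erased symbol
        for r2 in rights_of[l]:           # remove l's edges from the graph
            unknown[r2].discard(l)
            value[r2] ^= word[l]
            if len(unknown[r2]) == 1:
                ready.append(r2)
    return word

# Parity checks c0+c1+c2+c4 = 0, c1+c2+c3+c5 = 0, c0+c2+c3+c6 = 0.
checks = [[0, 1, 2, 4], [1, 2, 3, 5], [0, 2, 3, 6]]
codeword = [1, 0, 1, 1, 0, 0, 1]
recovered = peel_decode(checks, [None, 0, 1, 1, 0, None, 1])
stuck = peel_decode(checks, [None, None, None, 1, 0, 0, 1])
print(recovered, stuck)
```

Every edge is touched a bounded number of times, which is why the running time
is proportional to the number of edges. In the second call no right node of
degree one exists after the first stage, so the erasures stay unresolved.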

It is obvious that, on a RAM with unit cost measure, the amount of arithmetic
operations to finish the algorithm is at most proportional to the number of edges
in the graph, i.e., to $n a_L$, where $a_L$ is the average node degree on the
left. The aim is thus to find graphs with constant $a_L$ for which the recovery
algorithm finishes successfully.

The main contribution of [8] was to give an analytic condition on the maxi- 
mum fraction of tolerable losses in terms of the degree distribution of the graph. 
More precisely, define the left and the right degree of an edge in the graph as
the degree of the left, resp. right node it is emanating from. Further, denote
by $\lambda_i$ and $\rho_i$ the fraction of edges of left, resp. right degree $i$,
and consider the generating functions $\lambda(x) := \sum_i \lambda_i x^{i-1}$ and
$\rho(x) := \sum_i \rho_i x^{i-1}$. In the following

we will call the pair $(\lambda, \rho)$ a degree distribution.

Table 1. Rate 1/2 codes built from heavy tail/Poisson distribution.

$N$    $a_R$     $\theta$   $\delta/(1-R)$   $\delta$    $\hat\delta$   $\delta/\hat\delta$
8      5.9266    5.9105     0.91968          0.45984     0.49085        0.93682
16     7.0788    7.0729     0.95592          0.47796     0.49609        0.96345
27     8.0054    8.0027     0.97256          0.48628     0.49799        0.97648
47     9.0256    9.0243     0.98354          0.49177     0.49902        0.98547
79     10.007    10.007     0.98990          0.49495     0.49951        0.99087
132    10.996    10.996     0.99378          0.49689     0.49975        0.99427
221    12.000    12.000     0.99626          0.49813     0.49988        0.99650


The main theorem of [8] states that if the graph is chosen at random with
degree distribution $(\lambda, \rho)$ and if the erasures occur at random
positions, then the above erasure recovery algorithm can correct a
$\delta$-fraction of losses if $\rho(1 - \delta\lambda(x)) > 1 - x$ for
$x \in (0, 1]$. In a later paper [6], this condition was slightly relaxed to

$\delta\lambda(1 - \rho(1 - x)) < x$ for $x \in (0, \delta]$. (1)

The paper [8] further exhibited for any $\epsilon > 0$ and any rate $R$ an
infinite sequence of degree distributions $(\lambda, \rho)$ giving rise to codes
of rate at least $R$ such that the above inequality is valid for
$\delta = (1 - R)(1 - \epsilon)$, and such that the average left degree of the
graph is $O(\log(1/\epsilon))$. In other words, $\delta$ can get arbitrarily close
to its optimal value $(1 - R)$ with a "logarithmic" sacrifice in the running time
of the algorithm. Explicitly, for any given $0 < \epsilon < 1$, one chooses an
integer $N$ close to $1/\epsilon$ and considers the pair $(\lambda_N, \rho_N)$
with

$\lambda_N(x) = \frac{1}{H(N-1)} \sum_{k=1}^{N-1} \frac{x^k}{k}$,
$\rho_N(x) = e^{\theta_N (x - 1)}$, (2)

where $\theta_N$ is chosen to make the ratio of the number of nodes on the right
to the number of nodes on the left equal to $1 - R$, and $H(N - 1)$ is the
harmonic sum $\sum_{k=1}^{N-1} 1/k$. This sequence is referred to as the heavy
tail/Poisson sequence in the following. Table 1 gives an example of the
performance of this sequence for some codes of rate 1/2. In that table, $a_R$
denotes the average right degree, $\delta$ is the maximum tolerable loss
fraction, and $\hat\delta$ is a theoretical upper bound on $\delta$, see Remark 1
following Corollary 1.
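
The parameters of the heavy tail/Poisson pair can be computed numerically. The
sketch below assumes the form $\lambda_N(x) = \frac{1}{H(N-1)} \sum_{k=1}^{N-1}
x^k/k$, $\rho_N(x) = e^{\theta_N(x-1)}$ for the pair (an assumption where the
source display is damaged), determines $\theta_N$ by bisection from the
requirement $a_L/a_R = 1 - R$, and tests inequality (1) on a grid. The function
names and the grid size are illustrative choices, and a grid test is only a
numerical indication, not a proof.

```python
import math

def heavy_tail_poisson(N, rate):
    """Return (a_L, a_R, theta, lam, rho) for the assumed heavy tail/Poisson pair."""
    H = sum(1 / k for k in range(1, N))           # harmonic sum H(N-1)
    a_L = H / (1 - 1 / N)                         # 1 / integral of lambda
    a_R = a_L / (1 - rate)                        # from a_L / a_R = 1 - R
    lo, hi = 1e-9, 100.0
    for _ in range(200):                          # solve theta/(1-e^-theta) = a_R
        theta = (lo + hi) / 2
        if theta / -math.expm1(-theta) < a_R:
            lo = theta
        else:
            hi = theta
    lam = lambda x: sum(x ** k / k for k in range(1, N)) / H
    rho = lambda x: math.exp(theta * (x - 1))
    return a_L, a_R, theta, lam, rho

def condition_holds(lam, rho, delta, steps=2000):
    """Grid test of delta * lam(1 - rho(1 - x)) < x for x in (0, delta]."""
    return all(delta * lam(1 - rho(1 - x)) < x
               for x in (delta * i / steps for i in range(1, steps + 1)))

a_L, a_R, theta, lam, rho = heavy_tail_poisson(8, 0.5)
print(round(a_R, 3), round(theta, 3))
print(condition_holds(lam, rho, 0.43), condition_holds(lam, rho, 0.47))
```

For $N = 8$ and rate 1/2 the computed $a_R$ and $\theta$ agree with the first
row of Table 1 to about three decimal places.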

In the following we will show that this sort of relationship between the average 
degree and the maximum tolerable loss fraction is the best possible. Furthermore, 
we will exhibit a new sequence of degree distributions for which the same type of 
relationship holds between the average degree and the maximum tolerable loss 
fraction.

3 Upper Bounds for the Maximum Tolerable Loss Fraction



In this section we will prove some upper bounds on the maximum tolerable loss
fraction $\delta$ for which the algorithm given in the previous section is
successful when applied to a random graph with a given degree distribution. Our
main tool will be the inequality (1). We will first need a preliminary result.

Lemma 1. Let $G$ be a bipartite graph with edge degree distribution
$(\lambda, \rho)$. Then, the average left, resp. right, node degree of $G$ equals

$\frac{1}{\int_0^1 \lambda(x)\,dx}$, $\frac{1}{\int_0^1 \rho(x)\,dx}$,

respectively. Let the polynomials $\Lambda$ and $R$ be defined by

$\Lambda(x) := \frac{\int_0^x \lambda(t)\,dt}{\int_0^1 \lambda(t)\,dt}$,
$R(x) := \frac{\int_0^x \rho(t)\,dt}{\int_0^1 \rho(t)\,dt}$.

Then the coefficient of $x^i$ in $\Lambda(x)$, resp. $R(x)$, equals the fraction
of nodes of degree $i$ on the LHS, resp. RHS of $G$.




Proof. Let $L$ denote the number of nodes on the LHS of $G$. Then the number of
edges in $G$ equals $a_L L$, and the number of edges of left degree $i$ equals
$\lambda_i a_L L$. Hence, the number of nodes of degree $i$ on the LHS of the
graph equals $\lambda_i a_L L / i$, since each such node contributes to exactly
$i$ edges of degree $i$. Hence, the fraction of nodes of degree $i$ equals
$\lambda_i a_L / i$. Since the sum of all these fractions equals 1, we obtain
$a_L^{-1} = \sum_i \lambda_i / i = \int_0^1 \lambda(x)\,dx$ and this yields both
assertions for the LHS of the graph. The assertions on the RHS follow
similarly. □
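
The conversion in Lemma 1 from the edge perspective to the node perspective is
easy to carry out exactly. In the sketch below the function name
`node_distribution` and the example graph (two left nodes of degree 2 and one of
degree 4, hence 8 edges) are hypothetical and serve only to illustrate the
lemma.

```python
from fractions import Fraction

def node_distribution(edge_fracs):
    """edge_fracs maps a degree i to the fraction of edges of degree i.
    Returns (average node degree, {degree: fraction of nodes})."""
    # sum_i lambda_i / i is the integral of lambda over [0, 1].
    integral = sum(Fraction(f) / i for i, f in edge_fracs.items())
    avg = 1 / integral
    nodes = {i: Fraction(f) * avg / i for i, f in edge_fracs.items()}
    return avg, nodes

# 4 of the 8 edges leave degree-2 nodes, 4 leave the degree-4 node.
avg, nodes = node_distribution({2: Fraction(1, 2), 4: Fraction(1, 2)})
print(avg, nodes)
```

Here the average degree is 8 edges divided by 3 nodes, and two thirds of the
nodes have degree 2, as the lemma predicts.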



The following theorem shows a rather strong upper bound on $\delta$.

Theorem 1. Let $\lambda$ and $\rho$ denote the left and right edge degree
distributions of a bipartite graph. Let $\delta$ be a positive real number such
that

$\delta\lambda(1 - \rho(1 - x)) < x$

for $0 < x \le \delta$. Then we have

$\delta \le \frac{a_L}{a_R} \left(1 - (1 - \delta)^{a_R}\right)$,

where $a_L$ and $a_R$ are the average node degrees of the graph on the left,
resp. right, hand side.

Proof. As a real valued function the polynomial $\lambda(x)$ is strictly
increasing for positive values of $x$. Hence it has a unique inverse
$\lambda^{-1}$ which is also strictly increasing. The first of the above
inequalities is thus equivalent to

$1 - \rho(1 - x) < \lambda^{-1}(x/\delta)$

for $0 < x \le \delta$. Integrating both sides from $0$ to $\delta$ and
simplifying the expressions, we obtain

$\int_0^1 \rho(x)\,dx - \int_0^{1-\delta} \rho(x)\,dx \ge \delta \int_0^1 \lambda(x)\,dx$.

(Note that $\int_0^1 \lambda^{-1}(x)\,dx = 1 - \int_0^1 \lambda(x)\,dx$.) Invoking
the previous lemma, we thus have

$\delta \le \frac{a_L}{a_R} - a_L \int_0^{1-\delta} \rho(x)\,dx
= \frac{a_L}{a_R} \left(1 - \frac{\int_0^{1-\delta} \rho(x)\,dx}{\int_0^1 \rho(x)\,dx}\right)
= \frac{a_L}{a_R} \left(1 - R(1 - \delta)\right)$,

where the polynomial $R$ is defined as in the statement of Lemma 1. To finish
the proof we only need to show that $R(1 - \delta) \ge (1 - \delta)^{a_R}$. To see
this, first note that if $a_1, \ldots, a_M$ are nonnegative real numbers adding up
to 1 and $\mu$ is any nonnegative real number, then

$\sum_i a_i \mu^i \ge \mu^{\sum_i i a_i}$. (3)

This follows from taking logarithms of both sides and noting that the
log-function is concave. Denoting by $a_i$ the coefficients of $R(x)$ and setting
$\mu := 1 - \delta$, this gives our desired inequality: the $a_i$ are by the
previous lemma the fractions of nodes of degree $i$, and hence $\sum_i i a_i$ is
the average right degree $a_R$. □



Corollary 1. With the same assumptions as in the previous theorem, we have

$\delta \le \frac{a_L}{a_R} \left(1 - \left(1 - \frac{a_L}{a_R}\right)^{a_R}\right)$.

Proof. Follows from $\delta \le a_L/a_R$. □



Remark 1. A more refined upper bound for $\delta$ is given by the following. Let
$r$ denote the quantity $a_L/a_R$, i.e., one minus the rate of the code. Suppose
that $a_R > 1/r$ (an assumption that is automatically satisfied if the average
left degree is larger than one) and consider the function
$f(x) = x - r(1 - (1 - x)^{a_R})$. We have $f(0) = 0$, $f(1) = 1 - r > 0$,
$f'(0) < 0$, and $f'(x)$ has exactly one root. Hence, $f$ has exactly one nonzero
root, denoted by $\hat\delta_{a_R, r}$, and this root is between 0 and 1. The
previous theorem asserts that the maximum $\delta$ satisfying the inequality (1)
is smaller than $\hat\delta_{a_R, a_L/a_R}$.
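
The root described in Remark 1 is easy to approximate numerically. The sketch
below is one possible way to do so, not a method taken from the paper: it starts
a bisection at the minimizer of $f(x) = x - r(1 - (1-x)^{a_R})$, where $f$ is
negative, and ends it at $x = 1$, where $f$ is positive. The function name
`delta_hat` is hypothetical.

```python
def delta_hat(a_R, r, iters=200):
    """Unique nonzero root of f(x) = x - r (1 - (1 - x)^a_R) in (0, 1)."""
    if not (0 < r < 1 and a_R > 1 and r * a_R > 1):
        raise ValueError("need 0 < r < 1, a_R > 1 and a_R > 1/r")
    f = lambda x: x - r * (1 - (1 - x) ** a_R)
    # f'(x) = 1 - r a_R (1 - x)^(a_R - 1) vanishes here, and f < 0 at this point.
    lo = 1 - (r * a_R) ** (-1 / (a_R - 1))
    hi = 1.0                               # f(1) = 1 - r > 0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(delta_hat(5.9266, 0.5), 5))    # first row of Table 1
print(round(delta_hat(6, 1 / 3), 5))       # first row of Table 2
```

With the rounded table entries as input, the computed roots agree with the
$\hat\delta$ column of the tables to about four decimal places.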

The following definition is inspired by Corollary 1. 

Definition 1. A sequence $(\lambda_m, \rho_m)_{m \ge 1}$ of degree distributions
giving rise to codes of rate $R \ge R_0$ for a fixed $R_0 > 0$ is called
asymptotically quasi-optimal if there exists a constant $\mu$ (possibly depending
on $R$) and for each $m$ there exists a positive $\delta_m$ with
$\delta_m \lambda_m(1 - \rho_m(1 - x)) < x$ for $x \in (0, \delta_m]$ such that
the average right degree $a_R$ satisfies
$a_R \le \mu \log(1 - \delta_m/(1 - R))$.

The significance of quasi-optimal is the following: we want to construct codes
that can recover as many erasures as possible in as little time as possible. The
running time of the erasure recovery algorithm is proportional to the
block-length of the code times $a_R$, the average right degree. Hence, we are
looking for a trade-off between $a_R$ and $\delta$. Corollary 1 shows that we
cannot make $a_R$ too small. In fact, it implies that
$a_R \ge \log(1 - \delta/(1 - R))/\log R$. However, we might hope that we can make
$a_R$ smaller than $\log(1 - \delta/(1 - R))$ times a (negative) constant $\mu$.
In this way, we have maintained a qualitative optimality of the code sequence in
the sense that the running time increases only logarithmically with the relative
increase of the erasure correction capability.

The heavy tail/Poisson sequence (2) is asymptotically quasi-optimal as is 
shown in [8]. Starting in the next section, we will introduce another quasi-optimal 
sequence which has certain advantages over the heavy tail/Poisson sequence. 

We close this section by stating another useful upper bound on the maximum
possible value of $\delta$.

Lemma 2. Suppose that $\lambda$ and $\rho$ are polynomials and $\delta$ is such
that $\delta\lambda(1 - \rho(1 - x)) < x$ for $x \in (0, \delta]$. Then
$\delta \le 1/(\lambda'(0)\rho'(1))$.

Proof. Let $f(x) = \delta\lambda(1 - \rho(1 - x)) - x$. By assumption,
$f(x) < 0$ for $x \in (0, \delta]$. Since $f(0) = 0$, this implies that
$f'(0) \le 0$, where $f'(x)$ is the derivative of $f$ with respect to $x$. But
$f'(0) = \delta\lambda'(0)\rho'(1) - 1$, which gives the assertion. □

4 Fractional Binomial Coefficients 

As can be seen from its description (2), the heavy tail/Poisson sequence is
closely related to the Taylor series of $-\ln(1 - x)$. The new sequence that we
will describe in the next section will be related to the Taylor expansion of
$(1 - x)^\alpha$ where $1/\alpha$ is an integer. The coefficients of this
expansion are fractional binomial coefficients. For this reason, we will recall
some well-known facts about these numbers.

For real $\alpha$ and a positive integer $N$ we define

$\binom{\alpha}{N} := \frac{\alpha(\alpha - 1) \cdots (\alpha - N + 1)}{N!}$.

For convenience, we also define $\binom{\alpha}{0} := 1$. We have the Taylor
expansion

$1 - (1 - x)^\alpha = \sum_{k=1}^{\infty} \binom{\alpha}{k} (-1)^{k+1} x^k$. (4)

Note that for $0 < \alpha < 1$ the coefficients of the right hand power series
above are all positive. Furthermore, we have

$\sum_{k=1}^{N-1} \binom{\alpha}{k} (-1)^{k+1} = 1 - \frac{N}{\alpha} \binom{\alpha}{N} (-1)^{N+1}$, (5)

and

$\sum_{k=1}^{N-1} \frac{1}{k+1} \binom{\alpha}{k} (-1)^{k+1} = \frac{\alpha - \binom{\alpha}{N} (-1)^{N+1}}{\alpha + 1}$, (6)

as can easily be proved by induction.
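
The two finite-sum identities for fractional binomial coefficients can be
checked with exact rational arithmetic. In the sketch below the identities are
written in the form $\sum_{k=1}^{N-1} \binom{\alpha}{k}(-1)^{k+1} = 1 -
\frac{N}{\alpha}\binom{\alpha}{N}(-1)^{N+1}$ and $\sum_{k=1}^{N-1}
\frac{1}{k+1}\binom{\alpha}{k}(-1)^{k+1} = (\alpha -
\binom{\alpha}{N}(-1)^{N+1})/(\alpha+1)$, which is our reading of the damaged
displays and should be treated as an assumption. The helper names and the
values $\alpha = 1/5$, $N = 6$ are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import factorial

def frac_binom(alpha, n):
    """Fractional binomial coefficient alpha (alpha-1) ... (alpha-n+1) / n!."""
    prod = Fraction(1)
    for j in range(n):
        prod *= alpha - j
    return prod / factorial(n)

def signed(alpha, k):
    """binom(alpha, k) * (-1)^(k+1); positive when 0 < alpha < 1."""
    return frac_binom(alpha, k) * (-1) ** (k + 1)

alpha, N = Fraction(1, 5), 6
lhs5 = sum(signed(alpha, k) for k in range(1, N))
rhs5 = 1 - Fraction(N) / alpha * signed(alpha, N)
lhs6 = sum(signed(alpha, k) / (k + 1) for k in range(1, N))
rhs6 = (alpha - signed(alpha, N)) / (alpha + 1)
print(lhs5 == rhs5, lhs6 == rhs6)
```

Because `Fraction` arithmetic is exact, equality here is a genuine check of the
identities for the chosen parameters and not a floating point coincidence.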

In the following, we will derive some estimates on the size of the binomial
coefficients.

Proposition 1. Let $\alpha$ be a positive real number less than 1/2 and let
$N \ge 2$ be an integer. There is a constant $c$ independent of $\alpha$ and $N$
such that

$\frac{c\alpha}{N^{\alpha+1}} \le \binom{\alpha}{N} (-1)^{N+1} \le \frac{\alpha}{N^{\alpha+1}}$.

Proof. First note that $\binom{\alpha}{N} (-1)^{N+1}$ is positive. Taking its
logarithm and expanding the series $-\ln(1 - x) = \sum_k x^k/k$, we obtain

$\ln\left(\binom{\alpha}{N} (-1)^{N+1}\right) = \ln\frac{\alpha}{N} - \sum_{s=1}^{\infty} \frac{\alpha^s}{s} \sum_{k=1}^{N-1} \frac{1}{k^s}$.

(Note that the series involved are absolutely convergent, so that we can
rearrange the orders of the summations.) For an upper bound on this sum we
replace $\sum_{k=1}^{N-1} 1/k^s$ by 1 for $s \ge 2$ and use the left hand side of
the inequality

$\ln(N) + \gamma < \sum_{k=1}^{N} \frac{1}{k} < \ln(N) + \gamma + \frac{1}{2N}$,

where $\gamma$ is Euler's constant [5, pp. 480-481]. For the lower bound we use
the right hand side of the above inequality and replace $\sum_{k=1}^{N-1} 1/k^s$
for $s \ge 2$ by 2. This gives, after exponentiation,

[display illegible in the source: a two-sided bound on
$\binom{\alpha}{N} (-1)^{N+1}$ by multiples of $\alpha/N^{\alpha+1}$].

Noting that $0 < \alpha < 1/2$, we obtain our assertion. □



5 Right Regular Sequences 

A closer look at the proof of Theorem 1 reveals the following: one of the reasons
we might obtain a lower than optimal upper bound for the largest $\delta$
satisfying (1) is that the inequality (3) is not sharp. In fact, that inequality
is sharp iff all the $a_i$ except for one are zero, i.e., iff $\rho(x)$ has only
one nonzero coefficient, i.e., iff the graph has only one degree on the right
hand side. We call such graphs right regular in the following. We will study in
this section right regular degree distributions and design the left hand side in
such a way as to obtain asymptotically quasi-optimal sequences. We remark that
right regular sequences were previously studied in an unpublished
manuscript [9]. Our approach differs however from the one given in that paper.

From a theoretical point of view, codes obtained from these sequences should 
perform better than the heavy tail/Poisson distribution, since they allow for a 
larger loss-fraction for a given rate and a given average left degree. Furthermore,
the analysis given in [6] suggests that the actual performance of the code is 
related to how accurately the neighborhood of a message node is described by a 
tree given by the degree distributions. For instance, the performance of regular 
graphs is much more sharply concentrated around the value predicted by the 
theoretical analysis, than the performance of irregular graphs. Hence, if the graph 
is right regular, one should expect a smaller variance in the actual performance 
than for corresponding irregular codes. 

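For integers $a \ge 2$ and $N \ge 2$ we define

$\rho_a(x) := x^{a-1}$,

$\lambda_{\alpha,N}(x) := \alpha \frac{\sum_{k=1}^{N-1} \binom{\alpha}{k} (-1)^{k+1} x^k}{\alpha - N \binom{\alpha}{N} (-1)^{N+1}}$, (8)

where here and in the following we set $\alpha := 1/(a - 1)$. Note that
$\rho_a(1) = 1$, that $\lambda_{\alpha,N}(1) = 1$ by (5), and that
$\lambda_{\alpha,N}$ has positive coefficients. Hence, $(\rho_a, \lambda_{\alpha,N})$
indeed defines a degree distribution.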

Let $\nu < 1$ be a positive constant. The sequence we are interested in is given
by the following degree distributions:

$\rho_a(x)$, $\lambda_{\alpha,N}(x)$ with $N$ equal to $\nu^{-1/\alpha}$ rounded
to an integer. (9)

First we consider the rate of the codes defined by these distributions.

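Proposition 2. The rate $R = R_{a,\nu}$ of the code defined by the distributions
in (9) satisfies

$\frac{1 - \nu}{1 - c\nu^{1+1/\alpha}} \le 1 - R \le \frac{1 - c\nu}{1 - \nu^{1+1/\alpha}}$,

where $c$ is the constant from Proposition 1.

Proof. In the following, we denote by $N$ the quantity $\nu^{-1/\alpha}$ (rounded
to an integer) and by $\lambda(x)$ the function $\lambda_{\alpha,N}(x)$. We need
to compute the integral of $\lambda(x)$. We have

$\int_0^1 \lambda(x)\,dx = \frac{\alpha}{\alpha + 1} \cdot \frac{\alpha - \binom{\alpha}{N} (-1)^{N+1}}{\alpha - N \binom{\alpha}{N} (-1)^{N+1}}$.

Further, $\int_0^1 \rho_a(x)\,dx = 1/a = \alpha/(\alpha + 1)$. Hence, the rate of
the code is

$R = 1 - \frac{\alpha - N \binom{\alpha}{N} (-1)^{N+1}}{\alpha - \binom{\alpha}{N} (-1)^{N+1}} =: 1 - r_{\alpha,N}$.

Next, we estimate $r_{\alpha,N}$ using Proposition 1 again:

$\frac{1 - \nu}{1 - c\nu^{1+1/\alpha}} \le r_{\alpha,N} \le \frac{1 - c\nu}{1 - \nu^{1+1/\alpha}}$. (10)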

This gives the desired assertion on the rate. 



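□

Theorem 2. The sequence of degree distributions given in (9) is asymptotically
quasi-optimal.

Proof. Let us compute the maximum value of $\delta$ such that $\delta\lambda(1 - \rho(1 - x)) < x$
for $x \in (0, \delta)$, where $(\lambda, \rho)$ is the pair given in (9). We have

\[
\delta\lambda(1 - \rho(1 - x))
\le \frac{\delta\alpha}{\alpha - N\binom{\alpha}{N}(-1)^{N+1}}
\Bigl(1 - \bigl(1 - (1 - (1-x)^{1/\alpha})\bigr)^{\alpha}\Bigr)
= \frac{\delta\alpha}{\alpha - N\binom{\alpha}{N}(-1)^{N+1}}\, x.
\]

So, for $\delta\lambda(1 - \rho(1 - x)) < x$ we need

\[
\delta \le \frac{\alpha - N\binom{\alpha}{N}(-1)^{N+1}}{\alpha}.
\]

This is, by the way, the same upper bound as the one obtained from Lemma 2. In
the following we assume that $\delta$ is equal to the above upper bound, and compute
$\delta/(1 - R)$, $R$ being the rate of the code estimated in Proposition 2:

\[
\frac{\delta}{1-R} = \frac{\alpha - \binom{\alpha}{N}(-1)^{N+1}}{\alpha} \ge 1 - \frac{1}{N^{\alpha+1}}.
\]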

Ignoring diophantine constraints, we assume in the following that N = 

This gives 

-J— > 1 _ = 1 _ 

1 - i? - 

where $a_R = a = (\alpha + 1)/\alpha$ is the average right degree of the code. This proves
the assertion. □

In practical situations, one is interested in designing a code of a given fixed rate.
Since the parameter $\alpha$ in the definition of the sequence in (9) is the inverse
of an integer, the range of this parameter is discrete. Hence, we do not have
a continuous range of rates for these codes. However, we can come arbitrarily
close to our desired rate by making $\alpha$ smaller, i.e., by allowing high degrees
on the right-hand side of the graph. Some examples for different target rates are
given in Table 2. The value of $N$ has been hand-optimized in these examples to come
as close to the desired rate as possible. The last column in that table corresponds
to the value $\hat\delta$ defined in Remark 1 following Corollary 1. It gives a theoretical
upper bound on the best value of $\delta$. As can be observed from the table, the
maximum value of $\delta$ converges very quickly to the maximum possible value.
Also, a comparison between these codes and the heavy tail/Poisson distribution
in Table 1 reveals that the new codes are better in terms of the trade-off between
$\delta$ and $a_R$. However, the new codes need larger degrees on the left than the heavy
tail/Poisson distribution.
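The entries of Table 2 can be checked numerically. The Python sketch below is an illustration, not part of the original analysis: it assumes the closed forms $1 - R = (\alpha - NC)/(\alpha - C)$ and $\delta = (\alpha - NC)/\alpha$ with $C = \binom{\alpha}{N}(-1)^{N+1}$ and $\alpha = 1/(a_R - 1)$, as read from the proofs of Proposition 2 and Theorem 2. The function names are hypothetical.

```python
def gen_binom(alpha, k):
    """Generalized binomial coefficient binom(alpha, k) for real alpha, integer k >= 0."""
    result = 1.0
    for i in range(k):
        result *= (alpha - i) / (i + 1)
    return result


def right_regular_params(a, N):
    """Return (1 - R, delta) for right degree a and truncation parameter N.

    Assumed closed forms (see lead-in):
      1 - R = (alpha - N*C) / (alpha - C),  delta = (alpha - N*C) / alpha,
    where alpha = 1/(a - 1) and C = binom(alpha, N) * (-1)**(N + 1).
    """
    alpha = 1.0 / (a - 1)
    c = gen_binom(alpha, N) * (-1) ** (N + 1)
    one_minus_rate = (alpha - N * c) / (alpha - c)
    delta = (alpha - N * c) / alpha
    return one_minus_rate, delta


if __name__ == "__main__":
    # First two rows of Table 2: (N, a_R) = (2, 6) and (3, 7).
    for N, a in [(2, 6), (3, 7)]:
        r, delta = right_regular_params(a, N)
        print(N, a, round(r, 5), round(delta / r, 5), round(delta, 5))
```

Under these assumptions the sketch reproduces the $1 - R$, $\delta/(1-R)$ and $\delta$ columns of the first two rows of Table 2. The $\hat\delta$ column depends on Remark 1 and is not computed here.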

To obtain codes that have a fixed rate, one can modify $\alpha$ slightly. Such examples
are given in Table 3. Both parameters ($\alpha$ and $N$) of these codes are
hand-optimized to give the best results.

Table 2. Right regular sequences for rates R close to 2/3, 1/2, and 1/3.

   N      a_R   1 - R     δ/(1 - R)   δ         δ^        δ/δ^
   2      6     0.33333   0.60000     0.20000   0.29099   0.68731
   3      7     0.31677   0.74537     0.23611   0.28714   0.82230
   6      8     0.32886   0.88166     0.28994   0.31243   0.92801
   11     9     0.33645   0.93777     0.31551   0.32690   0.96514
   17     10    0.33357   0.96001     0.32024   0.32724   0.97860
   27     11    0.33392   0.97502     0.32558   0.32984   0.98711
   42     12    0.33381   0.98401     0.32847   0.33113   0.99197
   64     13    0.33312   0.98953     0.32963   0.33134   0.99484

   13     6     0.50090   0.96007     0.48090   0.49232   0.97679
   29     7     0.50164   0.98251     0.49287   0.49759   0.99052
   60     8     0.49965   0.99159     0.49545   0.49762   0.99563
   12 .   9     0.49985   0.99598     0.49784   0.49885   0.99797
   257    10    0.50000   0.99805     0.49903   0.49951   0.99904
   523    11    0.50002   0.99904     0.49954   0.49977   0.99953
   1058   12    0.49999   0.99953     0.49975   0.49986   0.99977

   111    6     0.66677   0.99698     0.66475   0.66584   0.99837
   349    7     0.66667   0.99904     0.66603   0.66636   0.99950
   1077   8     0.66663   0.99969     0.66642   0.66653   0.99984
   3298   9     0.66669   0.99990     0.66662   0.66665   0.99995

(Here δ^ denotes the upper bound on δ from Remark 1.)


6 Conclusions and Open Questions 

In this paper we have analyzed the theoretical performance of a simple erasure 
recovery algorithm applied to Gallager codes by deriving upper bounds on the 
maximum fraction of tolerable losses given a parameter that describes the run- 
ning time for encoding and decoding of the codes. We have shown that there is 
a trade-off between proximity to the optimal value of tolerable losses, i.e., one 
minus the rate of the code, and the average degree of nodes in the graph, in 
the sense that multiplying the average degree by a constant factor implies an 
exponential relative increase of the maximum tolerable loss fraction. Further, 
we have introduced a new sequence of graphs which are asymptotically close to 
optimal with respect to this criterion. These graphs are right regular and their 
node degree distribution on the left hand side is closely related to the power 
series expansion of $(1 - x)^{\alpha}$, where $\alpha$ is the inverse of an integer. Previously,
the only known such sequence was the heavy tail/Poisson sequence introduced 
in [8]. We have included examples which show that the new codes tolerate a 
higher fraction of losses if the average degree of the graph is fixed. 

It would be very interesting to extend the analysis given in this paper to 
other decoding algorithms, e.g., to the simple decoding algorithm of Gallager 
for error correcting codes [4,7]. Such an analysis would probably give clues for 
constructing infinite sequences of graphs that perform asymptotically optimally 
with respect to the decoding algorithm in question.

Table 3. Rate 1/2 codes, $\rho(x) = x^{a_R - 1}$, $\lambda(x) = \lambda_{\alpha,N}(x)$ as defined in (8) for arbitrary $\alpha$.

   N      α         a_R   δ/(1 - R)   δ         δ^        δ/δ^
   11     0.17662   6     0.95982     0.47991   0.49134   0.97673
   27     0.16412   7     0.98190     0.49095   0.49586   0.99009
   59     0.14225   8     0.99142     0.49571   0.49798   0.99543
   124    0.12480   9     0.99596     0.49798   0.49901   0.99794
   256    0.11112   10    0.99800     0.49900   0.49951   0.99898
   521    0.09993   11    0.99896     0.49948   0.49975   0.99945
   1057   0.09090   12    0.99950     0.49975   0.49988   0.99974

Abstract. We introduce and analyze a new statistical ensemble of
low-density parity-check convolutional (LDC) codes. The results of the
analysis are bounds, such as a lower bound for the free distance and upper
bounds for the burst error probability of the LDC codes.



1 Introduction 

In the past few years there has been enormous interest in low-density
convolutional (LDC) codes, combined with iterative decoding algorithms, because
such systems have been shown to achieve low bit error probabilities even for
signal-to-noise ratios close to the channel capacity. The best-known results
in this direction are the simulation results obtained by Berrou et al. [1] for the
so-called turbo-codes.

Low-density parity-check block codes were introduced in the 60s by Gallager 
[2]. The generalization of Gallager’s codes to low-density convolutional codes was 
presented in [3]. Our earlier work in this area [4,5,6] focused on the mathematical 
description of, and a general construction method for LDC codes, together with 
an iterative decoding algorithm for these codes. 

In this paper we introduce a new statistical model of LDC codes that can
be investigated analytically. The analytic approach that we take is
statistical ensemble analysis, and the resulting theoretical tools are
bounds on the free distance, and on the average bit and burst error probabilities
of the ensemble considered, if a maximum likelihood sequence estimator were
used in the decoder. We model the statistical ensemble through the use of a
device that we call a Markov scrambler, which is described in the paper.
The performance analysis of the ensemble is then reduced to the calculation of 
a refined average weight distribution through the study of a Markov process. To 
illustrate the method, we perform the analysis on some different classes of LDC 
codes, and compare the results. 

2 Code Description 

To introduce the ensemble of LDC codes, we modify some of the definitions in [6].
A rate $R = b/c$ LDC code is a convolutional code having a sparse syndrome
former matrix $H^T$, where $H$ is the parity-check matrix and $T$ denotes transposition.
If each row of $H^T$ contains $\nu$ ones and each column contains $\nu/(1 - R)$ ones,
the LDC code is called a homogeneous $(\nu, \nu/(1 - R))$-code. The code sequence
$v = \ldots, v_{-1}, v_0, v_1, \ldots, v_t, \ldots$, where $v_t = v_{tc}, \ldots, v_{(t+1)c-1}$, $v_i \in GF(2)$,
$i \in \mathbb{Z}$, satisfies the equality $vH^T = 0$. This equation can be written in recursive
form [7] as

\[
\sum_{i=0}^{m_s(t)} v_{t-i} H_i^T(t) = 0, \tag{1}
\]

where $H_i^T(t)$, $i = 0, 1, \ldots, m_s(t)$, are binary $c \times (c - b)$ submatrices, such that
$H_0^T(t)$ has full rank, and $H_{m_s(t)}^T(t) \ne 0$. Even though the syndrome former
memory $m_s$ is fixed in practice, we will suppose that it is a random variable
depending on $t$. Equation (1) defines the code subblock $v_t$ if subblocks
$v_{t-1}, v_{t-2}, \ldots, v_{t-m_s(t)}$ are known. We will suppose that the first $b$ symbols of $v_t$
coincide with the $t$th subblock of information symbols, $u_t = u_{tb}, \ldots, u_{(t+1)b-1}$,
$u_i \in GF(2)$, at the encoder input, i.e. the encoder is systematic.

In [6] a turbo-code was defined as an LDC code whose syndrome former matrix
$H^T$ can be represented as the product of a convolutional scrambler $S$ and the
syndrome former matrix $H_b^T$ of a basic (component) code, i.e. $H^T = S H_b^T$. A
convolutional scrambler $S$ is defined as an infinite matrix $S = (s_{ij})$, $i, j \in \mathbb{Z}$,
$s_{ij} \in \{0, 1\}$, that has one one in each column and at least one one in each row,
and that satisfies the causality condition. If all rows have the same weight $\eta$
(number of ones), then the scrambler is called homogeneous. We also consider
semi-homogeneous scramblers, when all even rows of $S$ have weight $\eta$, all
odd rows have weight $\eta - 1$, and all columns have equal weight. An alternative
definition of a scrambler is that it is a device that maps input $c$-tuples onto
output $d$-tuples, $d > c$, by permuting the input symbols and making copies of
(some of) them. The ratio $d/c$ is called the scrambler rate $R_s$. The rate $R$ of
the turbo-code is $R = 1 - R_s(1 - R_b)$, where $R_b$ is the rate of the basic code.
The memory $M$ of the scrambler is defined as the number of symbols that the
decoder keeps in its memory.

We have studied two classes of turbo-codes, A and B. In class A, a rate $R_s = d/c$
convolutional scrambler is followed by a rate $R_b = (d - 1)/d$ degenerate
component convolutional encoder of memory zero. (It calculates one parity-check
symbol for every $d - 1$ input symbols.) In class B, a rate $R_s = d/c$ convolutional
scrambler is followed by a rate $R_b = (d - c + b)/d$ component convolutional
encoder. To simplify the description in this paper we consider only rate $R = 1/2$
LDC codes.



3 Statistical Ensemble of Turbo-Codes 

To define a statistical ensemble of turbo-codes, we first introduce a statistical
ensemble of rate $R_s = d/c$ ($c = 2$) memory $M$ Markov scramblers. The simplest
way to describe this ensemble of scramblers is to represent the scrambler as a
box that contains $M$ bits.¹ Initially the box is filled with $M$ dummy zeros. When
the information subblock $u_t$ of $b = 1$ information bits enters the LDC encoder
input, the encoder randomly picks $d - c = c(R_s - 1) = 2(R_s - 1)$ bits from the
box and replaces them by $d - c = d - 2$ new bits, generated by the encoder. For
example, in case A the basic encoder picks, at each time instant $t$, $2(R_s - 1)$
bits ($R_s - 1$ is a positive integer) from the box and the information bit $u_t$ and
calculates the parity-check symbol of subblock $v_t$. Then $(R_s - 1)$ copies of both
2 bits of subblock $v_t$ are put into the box to replace the $2(R_s - 1)$ bits picked
before. In case B, the rate $R_b = 2/3$ ($d = 3$) component encoder picks, at each
time instant $t$, one bit from the box and replaces it by the new information bit
$u_t$. The input of the component encoder is the information bit and the bit picked
from the box. The output is the information bit and the parity-check bit.

Let $a_t(l, d)$ be the mathematical expectation, in the ensemble of turbo-codes,
of the number of weight $d$ paths that depart from the allzero path at time instant
$t$ and merge again with the allzero path at time $t + l$. Since the statistical
properties of the ensemble do not depend on $t$, we will skip the index $t$ and
consider the case $t = 0$. To calculate the average spectrum $a(l, d)$ we can use
recurrent equations.

Example 1. (Class A, rate $R = 1/2$ LDC (2,4)-code)

This code can be represented as a turbo-code with a rate $R_s = 4/2$ ($d = 4$)
convolutional scrambler. Let $\mu$ ($\mu$ is even), the number of ones in the scrambler,
be the scrambler state at time instant $t$. To the allzero path corresponds the
sequence of zero states. To a path which departs from the allzero path at moment
$t = 0$ and merges again with the allzero path at moment $t = l$ corresponds a
sequence of states $\mu_0, \mu_1, \mu_2, \ldots, \mu_l$, where $\mu_0 = \mu_l = 0$ and $\mu_i \ne 0$, $0 < i < l$.
Let $f(\mu, l, d)$ be the mathematical expectation of the number of paths of weight
$d$, which start at moment $t$ in state $\mu$ and reach state zero for the first time at
moment $t + l$. Since any path departing from the allzero path has to come to
state $\mu = 2$ first, and since the transition from the zero state to state $\mu = 2$ has
weight 2,

\[
a(l, d) = f(2, l - 1, d - 2). \tag{2}
\]

The state-transition diagram of the encoder is presented in Fig. 1. All branches
depart from state $\mu$. They are labeled by the probability of the corresponding
transition and the generated code subblock $v_t$. From state $\mu$ the scrambler can
come to the states $\mu$, $\mu + 2$ or $\mu - 2$, as illustrated by the arrows in the figure,
and the weight of the generated subblock can be 0, 1 or 2.

For example, the lower right branch corresponds to the transition from state
$\mu$ to state $\mu + 2$ when the encoder generates the subblock $v_t = 11$. This
happens when the basic encoder picks two zeros from the box (with probability
$(M - \mu)(M - \mu - 1)/(M(M - 1))$) and a nonzero information symbol enters the encoder.

¹ A more formal definition of Markov scramblers will be given in [8].

Fig. 1. State-transition diagram of the encoder in Example 1.

Using this state-transition diagram we can get the following system of recurrent
equations

\[
\begin{aligned}
f(\mu, l, d) ={}& \frac{(M - \mu)(M - \mu - 1)}{M(M - 1)}
  \bigl(f(\mu, l - 1, d) + f(\mu + 2, l - 1, d - 2)\bigr) \\
&+ \frac{4\mu(M - \mu)}{M(M - 1)}\, f(\mu, l - 1, d - 1) \\
&+ \frac{\mu(\mu - 1)}{M(M - 1)}
  \bigl(f(\mu - 2, l - 1, d) + f(\mu, l - 1, d - 2)\bigr), \quad l \ne 0,
\end{aligned} \tag{3}
\]

\[
f(0, l, d) =
\begin{cases}
1, & l = 0,\ d = 0, \\
0, & \text{otherwise},
\end{cases} \tag{4}
\]

\[
f(M, l, d) = f(M, l - 1, d - 2) + f(M - 2, l - 1, d). \tag{5}
\]

The recurrent solution of (3)-(5) gives the average spectrum $a(l, d)$. The
generalization to other codes of case A ($R_s = 2.5, 3, 4$) is straightforward.
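The recursion of Example 1 is easy to evaluate with memoization. The following Python sketch is an illustration under stated assumptions, not the authors' implementation: the transition probabilities are those obtained by drawing two bits without replacement from a box holding $\mu$ ones among $M$ bits, both values of the information bit are counted, and $f(\mu, 0, d)$ is taken to be 0 for $\mu \ne 0$. The function name `average_spectrum` is hypothetical.

```python
from fractions import Fraction
from functools import lru_cache


def average_spectrum(M):
    """Return a function a(l, d) for the class A (2,4) ensemble with box size M."""

    @lru_cache(maxsize=None)
    def f(mu, l, d):
        # Out-of-range arguments contribute nothing.
        if l < 0 or d < 0 or mu < 0 or mu > M:
            return Fraction(0)
        if mu == 0:
            # Boundary condition: the zero state has been reached.
            return Fraction(1) if (l == 0 and d == 0) else Fraction(0)
        if l == 0:
            # Assumption: a nonzero state cannot be "at zero" after 0 steps.
            return Fraction(0)
        denom = M * (M - 1)
        p00 = Fraction((M - mu) * (M - mu - 1), denom)  # two zeros picked
        p01 = Fraction(2 * mu * (M - mu), denom)        # one zero, one one
        p11 = Fraction(mu * (mu - 1), denom)            # two ones picked
        return (
            p00 * (f(mu, l - 1, d) + f(mu + 2, l - 1, d - 2))
            + p01 * 2 * f(mu, l - 1, d - 1)  # both info bits give weight 1
            + p11 * (f(mu - 2, l - 1, d) + f(mu, l - 1, d - 2))
        )

    def a(l, d):
        # A path first moves from state 0 to state 2 with weight 2.
        return f(2, l - 1, d - 2)

    return a


if __name__ == "__main__":
    a = average_spectrum(4)
    for l in range(2, 6):
        print(l, [a(l, d) for d in range(0, 2 * l + 1)])
```

Exact rational arithmetic via `Fraction` avoids rounding in the small probabilities; for large $M$ and long paths a floating-point table filled iteratively would be the more practical choice.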



Example 2. (Class B, rate $R = 1/2$ turbo-code with a time-invariant convolutional,
rate $R_b = 2/3$, component code of syndrome former memory 2)

Consider the case when the submatrices of the parity-check matrix of the
time-invariant component code are given as

\[
H_0 = 111, \quad H_1 = 101, \quad H_2 = 011 \tag{6}
\]

and the first symbol in subblock $v_t = v_{3t}, v_{3t+1}, v_{3t+2}$ is the information symbol
$u_t$, entering the turbo-encoder, $v_{3t+1}$ is the bit that the component encoder picks
from the scrambler box and $v_{3t+2}$ is the parity-check symbol generated by the
component encoder. The syndrome former of the component encoder can be in
one of 4 states $s$, $s = 0, 1, 2, 3$ (0 = 00, 1 = 01, 2 = 10, 3 = 11). Together with the
state of the scrambler $\mu$, i.e. the number of ones within the box, the state $s$ of the
syndrome former of the component code defines the state of the turbo-encoder.
By definition, the zero state corresponds to $\mu = 0$, $s = 0$. Let $f(\mu, s, l, d)$ be
the mathematical expectation of the number of paths of weight $d$, which start
at some moment $t$ in state $(\mu, s)$ and reach the zero state for the first time at
moment $t + l$. Since any path departing from the allzero path has to come to
state (1,1) first, and since the transition from the zero state to this state has
weight 2, the average spectrum of the turbo-code is

\[
a(l, d) = f(1, 1, l - 1, d - 2). \tag{7}
\]

The state-transition diagram of the encoder is presented in Fig. 2. The branches
are labeled analogously to Example 1. We emphasize that all branches depart
from state $(\mu, \cdot)$ and end in the state that the arrows in the figure point to.




Fig. 2. State-transition diagram of the encoder in Example 2. 



Using this state-transition diagram we can obtain a system of recurrent
equations, analogous to system (3)-(5). The solution of this system gives the average
spectrum $a(l, d)$. The generalization to other codes of class B (with component
codes of syndrome former memory 3, 4 and 5) is straightforward.



4 Ensemble Analysis and Simulation Results

Ensemble analysis is widely used in coding theory. Often it can be hard to
analyze a specific code, but easier to analyze the average behavior of a collection
of codes. Obviously, at least one of the codes in the ensemble performs better
than, or the same as, the average code.

The solution of the systems of recurrent equations describing the behavior of
the ensembles of turbo-codes of classes A and B gives us the average spectrum
$a(l, d)$ of each ensemble. Actually, in this paper we are more interested in the
refined average spectrum $a(d) = \sum_l a(l, d)$. We get this spectrum if we leave
out the argument $l$ in the system of recurrent equations. It is obvious that if we
find $\hat d = \hat d(\pi)$, $0 < \pi < 1$, such that

\[
\sum_{d \le \hat d - 1} a(d) < 1 - \pi, \tag{8}
\]

at least a fraction $\pi$ of the codes in the ensemble has free distance $d_{\mathrm{free}}$ not less
than $\hat d$. The calculation of $\hat d$ for different codes gives us a Costello-type lower
bound for the free distance of different codes. In Fig. 3 lower bounds for the free
distance of some different codes are given as a function of the scrambler size $M$.
It is worth noting that for the LDC (2,4)- and (2.5,5)-codes of class A (the latter
ones have a semi-homogeneous scrambler with three ones in even rows and two
ones in odd rows) and for the codes of class B, the bound grows logarithmically
with $M$. For the LDC (3,6)-code of class A it grows linearly.




[Figure 3: plot of the bounds; horizontal axis: scrambler size M]

Fig. 3. Lower bounds on the free distance. The dashed lines correspond to (from
bottom to top) (2,4), (2.5,5) and (3,6) codes of class A. The solid lines correspond
to class B codes with memory 2, 3, 4 and 5.

Using the average spectrum $a(d)$ we can easily get an upper bound for the
average burst error probability $P_B$. In fact, if the transmission is over an additive
white Gaussian noise (AWGN) channel with signal-to-noise ratio $E_s/N_0$, then

\[
P_B < \sum_d a(d) D^d, \tag{9}
\]

where $D = \exp(-E_s/N_0)$. We note that the bound (9) can be calculated by
directly solving the system of equations, recurrent in $\mu$, with respect to the
function

\[
F_D(\mu) = \sum_l \sum_d f(\mu, l, d) D^d. \tag{10}
\]

The upper bound (9) for the burst error probability can be improved if we
expurgate bad codes from the ensemble, i.e. codes that have free distance less
than $\hat d(\pi)$, where $\hat d(\pi)$ satisfies (8). Then the bound for the average burst error
probability over the expurgated sub-ensemble of codes (expurgated bound) is

\[
P_{B,\mathrm{exp}} < \frac{1}{\pi} \sum_{d \ge \hat d(\pi)} a(d) D^d. \tag{11}
\]

Results of the calculation of the upper (union) bound (9) and expurgated bound
(11) with $\pi = 1/2$ (only for class A codes) for the burst error probability are
presented, together with simulation results of the burst error probability, in
Figs. 4 and 5, and simulation results of the bit error probability are presented in
Figs. 6 and 7. Note that the bounds are average ensemble bounds for maximum
likelihood decoding, while the simulations have been performed with a specific,
randomly chosen LDC code, using an iterative decoding procedure. In principle,
we can use the refined state-transition diagram of the ensemble for calculation
of upper bounds for the bit error probability, analogously to the case of “usual”
convolutional codes [7].
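Given a refined average spectrum, the bounds (8), (9) and (11) reduce to a few sums. The sketch below shows one way to evaluate them in Python. The spectrum is passed as a dictionary mapping weight $d$ to $a(d)$, which is an assumed data layout; the function names and the sample numbers in the usage part are hypothetical and are not taken from the paper.

```python
import math


def union_bound(spectrum, es_over_n0):
    """Right-hand side of (9): sum over d of a(d) * D**d with D = exp(-Es/N0)."""
    D = math.exp(-es_over_n0)
    return sum(a_d * D ** d for d, a_d in spectrum.items())


def expurgation_distance(spectrum, pi):
    """Largest d_hat with sum_{d <= d_hat - 1} a(d) < 1 - pi, as in (8)."""
    total = 0.0
    d_max = max(spectrum)
    for d in range(1, d_max + 1):
        total += spectrum.get(d, 0.0)
        if total >= 1 - pi:
            # The partial sum up to d - 1 was still below 1 - pi.
            return d
    return d_max + 1


def expurgated_bound(spectrum, es_over_n0, pi):
    """Right-hand side of (11): (1/pi) * sum_{d >= d_hat} a(d) * D**d."""
    D = math.exp(-es_over_n0)
    d_hat = expurgation_distance(spectrum, pi)
    tail = sum(a_d * D ** d for d, a_d in spectrum.items() if d >= d_hat)
    return tail / pi


if __name__ == "__main__":
    # Made-up spectrum, for illustration only.
    toy_spectrum = {2: 0.1, 3: 0.3, 4: 0.5}
    print(expurgation_distance(toy_spectrum, 0.5))
    print(union_bound(toy_spectrum, 1.0), expurgated_bound(toy_spectrum, 1.0, 0.5))
```

In practice the spectrum would be truncated at some maximum weight, so both sums are approximations of the infinite series.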

5 Conclusion 

In this paper we presented preliminary results of the statistical analysis of a
wide class of LDC codes. The results are lower bounds for the free distance and
upper bounds for the error probability. The bounds for the error probability
that we have calculated so far are non-trivial only for relatively large
signal-to-noise ratios. In future work we plan to obtain bounds for smaller
signal-to-noise ratios, close to the Shannon limit.

Fig. 4. Class A: burst error probabilities. The solid lines show (from top to
bottom) simulation results for $m_s$ = 129, 257, 513, 1025, 2049, 4097. The union
bound (dashed-dotted) and the expurgated bound (dotted) are shown for $m_s$ =
129.

Fig. 5. Class B: burst error probabilities. The solid lines show (from top to
bottom) simulation results for $m_s$ = 1024, 4096, 16384. The union bound
(dashed-dotted) is shown for $m_s$ = 1024.

[Plot panels: (2,4) and (2.5,5)]

Fig. 6. Class A: bit error probabilities. The solid lines show (from top to bottom)
simulation results for $m_s$ = 129, 257, 513, 1025, 2049, 4097.

Fig. 7. Class B: bit error probabilities. The solid lines show (from top to bottom)
simulation results for $m_s$ = 1024, 4096, 16384.

Abstract. The nonlinear congruential method is an attractive alternative
to the classical linear congruential method for pseudorandom number
generation. In this paper we present a new type of discrepancy bound for
sequences of s-tuples of successive nonlinear multiple recursive congruential
pseudorandom numbers of higher orders. In particular, we generalize
some recent results about recursive congruential pseudorandom numbers
of first order.



1 Introduction 

In this paper we study some distribution properties of pseudorandom number
generators defined by a recurrence congruence modulo a prime $p$ of the form

\[
u_{n+1} \equiv f(u_n, \ldots, u_{n-m+1}) \pmod p, \qquad n = m - 1, m, \ldots, \tag{1}
\]

with some initial values $u_0, \ldots, u_{m-1}$, where $f(X_1, \ldots, X_m)$ is a rational function
of $m$ variables over the field $\mathbb{F}_p$ of $p$ elements. We also assume that $0 \le u_n < p$,
$n = 0, 1, \ldots$. Composite moduli have also been considered in the literature, but
we will restrict our attention to prime moduli.

It is obvious that the sequence (1) eventually becomes periodic with some
period $t \le p^m$. Throughout this paper we assume that this sequence is purely
periodic, that is, that $u_n = u_{n+t}$ beginning with $n = 0$; otherwise we consider a
shift of the original sequence.
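The recurrence (1) and the notions of period and pure periodicity can be illustrated with a short Python sketch. It is only an illustration: the helper names are hypothetical, the map $f$ is supplied by the caller as an ordinary Python function of $m$ arguments, and the sample maps used below (including a linear one, chosen because its period is easy to verify by hand) are assumptions rather than generators studied in the paper.

```python
from itertools import islice


def nonlinear_recursive(f, init, p):
    """Yield u_0, u_1, ... where u_{n+1} = f(u_n, ..., u_{n-m+1}) mod p."""
    # State holds the last m values, most recent first: (u_n, ..., u_{n-m+1}).
    state = tuple(x % p for x in reversed(init))
    for x in init:
        yield x % p
    while True:
        nxt = f(*state) % p
        yield nxt
        state = (nxt,) + state[:-1]


def preperiod_and_period(f, init, p):
    """Return (preperiod, period) of the state sequence; preperiod 0 means purely periodic."""
    state = tuple(x % p for x in reversed(init))
    seen = {}
    n = 0
    while state not in seen:
        seen[state] = n
        nxt = f(*state) % p
        state = (nxt,) + state[:-1]
        n += 1
    return seen[state], n - seen[state]


if __name__ == "__main__":
    p = 101
    # Hypothetical order-2 polynomial map, for illustration only.
    f = lambda x, y: x * x + y + 1
    print(list(islice(nonlinear_recursive(f, [0, 1], p), 10)))
    print(preperiod_and_period(f, [0, 1], p))
```

Because the state is an $m$-tuple over $\mathbb{F}_p$, at most $p^m$ distinct states can occur, which is the bound $t \le p^m$ stated above. A genuinely rational $f$ would need modular inverses, e.g. `pow(x, -1, p)` in Python 3.8 or later, with a convention for $x = 0$.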

These nonlinear congruential generators provide a very attractive alternative
to linear congruential generators and, especially in the case $m = 1$, have been
extensively studied in the literature (see [4,15] for surveys). Although linear
congruential generators are widely used, they have several well-known deficiencies,
such as machine-dependent upper bounds on the period length and an unfavorable
lattice structure (see [9,12,13,15]). On the other hand, nonlinear congruential
generators, and especially inversive congruential generators, tend to have
a lesser amount of intrinsic structure and are thus preferable in this respect.
Furthermore, by working with recurrences of order $m > 1$ in the nonlinear
congruential method, we can overcome the restriction in first-order recurrences that
the period length cannot exceed the modulus. For these reasons, it is of interest
to study nonlinear congruential generators of higher orders.

When $m = 1$, for sequences of the largest possible period $t = p$, a number
of results about the distribution of the fractions $u_n/p$ in the interval $[0, 1)$ and,
more generally, about the distribution of the points

\[
\left( \frac{u_n}{p}, \ldots, \frac{u_{n+s-1}}{p} \right) \tag{2}
\]

in the $s$-dimensional unit cube $[0, 1)^s$ have been obtained in the case where $n$ runs
through the full period, $n = 0, \ldots, p - 1$. Many of these results are essentially
best possible. We refer to [4,5,8,9,10,13,14,15,16,17,18,19] for more details and
precise references to original papers. The case of periods $t < p$ is of interest as
well.

Quite recently, in the series of papers [8,16,17,18,19] a new method has been
introduced and successfully applied to the case $m = 1$. In the present paper
we show that the same method works for nonlinear generators of arbitrary
order $m \ge 1$. In particular, we obtain rather weak but nontrivial bounds on the
discrepancy of the points (2) when $n$ runs over a part of the full period.

In the very special case where $f(X) = X^e$, that is, for the power generator,
an alternative approach has been proposed in [6]. This approach, although
it has produced quite strong results for the power generator, cannot be
extended to other nonlinear generators. Moreover, a combination of the approach
of [8,16,17,18,19] and the present work with some additional considerations has
been used in [7] to improve the results of [6].

2 Definitions and Auxiliary Results 

For a sequence of $N$ points

\[
\Gamma = (\gamma_{1,n}, \ldots, \gamma_{s,n})_{n=1}^{N} \tag{3}
\]

of the half-open interval $[0, 1)^s$, denote by $\Delta_\Gamma$ its discrepancy, that is,

\[
\Delta_\Gamma = \sup_{B \subseteq [0,1)^s} \left| \frac{T_\Gamma(B)}{N} - |B| \right|,
\]

where $T_\Gamma(B)$ is the number of points of the sequence $\Gamma$ which hit the box

\[
B = [\alpha_1, \beta_1) \times \ldots \times [\alpha_s, \beta_s) \subseteq [0, 1)^s
\]

and the supremum is taken over all such boxes.

For an integer vector $a = (a_1, \ldots, a_s) \in \mathbb{Z}^s$ we put

\[
|a| = \max_{i=1,\ldots,s} |a_i|, \qquad r(a) = \prod_{i=1}^{s} \max\{|a_i|, 1\}. \tag{4}
\]

We need the Erdős–Turán–Koksma inequality (see Theorem 1.21 of [3]) for
the discrepancy of a sequence of points of the $s$-dimensional unit cube, which we
present in the following form.

Lemma 1. There exists a constant $C_s > 0$ depending only on the dimension $s$
such that, for any integer $L \ge 1$, for the discrepancy of a sequence of points (3)
the bound

\[
\Delta_\Gamma < C_s \left( \frac{1}{L} + \frac{1}{N} \sum_{0 < |a| \le L} \frac{1}{r(a)}
\left| \sum_{n=1}^{N} \exp\Bigl( 2\pi i \sum_{j=1}^{s} a_j \gamma_{j,n} \Bigr) \right| \right)
\]

holds, where $|a|$, $r(a)$ are defined by (4) and the sum is taken over all integer
vectors

\[
a = (a_1, \ldots, a_s) \in \mathbb{Z}^s
\]

with $0 < |a| \le L$.

The currently best value of $C_s$ is given in [1]. We put

\[
e(z) = \exp(2\pi i z/p).
\]



Our second main tool is the Weil bound on exponential sums (see [2] and
Chapter 5 of [11]) which we present in the following form.

Lemma 2. For any nonconstant polynomial $F(X_1, \ldots, X_m) \in \mathbb{F}_p[X_1, \ldots, X_m]$
of total degree $D$ we have the bound

\[
\left| \sum_{x_1, \ldots, x_m = 1}^{p} e\bigl(F(x_1, \ldots, x_m)\bigr) \right| \le (D - 1)\, p^{m - 1/2}.
\]





3 Discrepancy Bound 

Let the sequence $(u_n)$ generated by (1) be purely periodic with an arbitrary
period $t$. For an integer vector $a = (a_0, \ldots, a_{s-1}) \in \mathbb{Z}^s$ we introduce the
exponential sum

\[
S_a(N) = \sum_{n=0}^{N-1} e\Bigl( \sum_{j=0}^{s-1} a_j u_{n+j} \Bigr).
\]
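For small parameters the exponential sum $S_a(N)$ can be evaluated directly, which is useful for comparing actual values with bounds of the type proved below. The following Python sketch assumes the sequence is given as a list `u` of residues with at least $N + s - 1$ entries; the function name `exp_sum` is hypothetical.

```python
import cmath


def exp_sum(u, a, p, N):
    """Compute S_a(N) = sum_{n=0}^{N-1} e(sum_j a_j * u[n+j]), with e(z) = exp(2*pi*i*z/p)."""
    if len(u) < N + len(a) - 1:
        raise ValueError("sequence too short for the requested N and s")
    total = 0j
    for n in range(N):
        # Reduce the phase mod p first to keep the argument small.
        phase = sum(a_j * u[n + j] for j, a_j in enumerate(a)) % p
        total += cmath.exp(2j * cmath.pi * phase / p)
    return total


if __name__ == "__main__":
    p = 5
    u = [0, 1, 2, 3, 4, 0]          # one period of u_{n+1} = u_n + 1 mod 5, plus wrap
    print(abs(exp_sum(u, (1,), p, 5)))    # full sum over all residues
    print(abs(exp_sum(u, (0,), p, 5)))    # trivial vector a = 0
```

For the trivial vector the sum equals $N$, which is why the condition $\gcd(a_0, \ldots, a_{s-1}, p) = 1$ appears in the statements that follow.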

We estimate these sums, and thus the discrepancy of corresponding
sequences, for polynomials which satisfy a certain special property which is
described below.

We say that a polynomial $f(X_1, \ldots, X_m) \in \mathbb{F}_p[X_1, \ldots, X_m]$ has a dominating
term if it is of the form

\[
f(X_1, \ldots, X_m) = a_{d_1 \ldots d_m} X_1^{d_1} \cdots X_m^{d_m}
+ \sum_{i_1=0}^{d_1 - 1} \cdots \sum_{i_m=0}^{d_m - 1} a_{i_1 \ldots i_m} X_1^{i_1} \cdots X_m^{i_m}
\]

with some integers $d_1 \ge 1, d_2 \ge 0, \ldots, d_m \ge 0$ and coefficients $a_{i_1 \ldots i_m} \in \mathbb{F}_p$ with
$a_{d_1 \ldots d_m} \ne 0$.

Theorem 1. If the sequence $(u_n)$, given by (1) generated by a polynomial
$f(X_1, \ldots, X_m) \in \mathbb{F}_p[X_1, \ldots, X_m]$ of total degree $d \ge 2$ and having a dominating
term, is purely periodic with period $t$ and $t \ge N \ge 1$, then the bound

\[
\max_{\gcd(a_0, \ldots, a_{s-1}, p) = 1} |S_a(N)| = O\bigl( N^{1/2} p^{m/2} \log^{-1/2} p \bigr)
\]

holds, where the implied constant depends only on $d$ and $s$.



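Proof. Select any $a = (a_0, \ldots, a_{s-1}) \in \mathbb{Z}^s$ with $\gcd(a_0, \ldots, a_{s-1}, p) = 1$. It is
obvious that for any integer $k \ge 0$ we have

\[
\left| S_a(N) - \sum_{n=0}^{N-1} e\Bigl( \sum_{j=0}^{s-1} a_j u_{n+k+j} \Bigr) \right| \le 2k.
\]

Therefore, for any integer $K \ge 1$,

\[
K |S_a(N)| \le W + K^2,
\]

where

\[
W = \left| \sum_{n=0}^{N-1} \sum_{k=0}^{K-1} e\Bigl( \sum_{j=0}^{s-1} a_j u_{n+k+j} \Bigr) \right|
\le \sum_{n=0}^{N-1} \left| \sum_{k=0}^{K-1} e\Bigl( \sum_{j=0}^{s-1} a_j u_{n+k+j} \Bigr) \right|.
\]

Define the sequence of polynomials $f_k(X_1, \ldots, X_m) \in \mathbb{F}_p[X_1, \ldots, X_m]$ by
the recurrence relation

\[
f_k(X_1, \ldots, X_m) = f\bigl( f_{k-1}(X_1, \ldots, X_m), \ldots, f_{k-m}(X_1, \ldots, X_m) \bigr),
\qquad k = 1, 2, \ldots,
\]

where $f_k(X_1, \ldots, X_m) = X_{1-k}$, $k = -m + 1, \ldots, 0$.

It is easy to see that $f_k$ is a nonconstant polynomial of total degree at most
$d^k$ and that $u_{n+k} = f_k(u_n, \ldots, u_{n-m+1})$, $k = 1, 2, \ldots$.

Accordingly, we obtain

\[
\begin{aligned}
W^2 &\le N \sum_{n=0}^{N-1} \left| \sum_{k=0}^{K-1}
  e\Bigl( \sum_{j=0}^{s-1} a_j f_{k+j}(u_n, \ldots, u_{n-m+1}) \Bigr) \right|^2 \\
&\le N \sum_{w_1, \ldots, w_m \in \mathbb{F}_p} \left| \sum_{k=0}^{K-1}
  e\Bigl( \sum_{j=0}^{s-1} a_j f_{k+j}(w_1, \ldots, w_m) \Bigr) \right|^2 \\
&= N \sum_{k,l=0}^{K-1} \sum_{w_1, \ldots, w_m \in \mathbb{F}_p}
  e\Bigl( \sum_{j=0}^{s-1} a_j \bigl( f_{k+j}(w_1, \ldots, w_m) - f_{l+j}(w_1, \ldots, w_m) \bigr) \Bigr).
\end{aligned}
\]

If $k = l$, then the inner sum is trivially equal to $p^m$. There are $K$ such sums.
Because $f$ has a dominating term, the total degree of the polynomials $f_\nu$ grows
strictly monotonically with $\nu = 1, 2, \ldots$. Therefore we can apply Lemma 2 to
the inner sum, getting the upper bound $d^{K+s-2} p^{m-1/2}$ for at most $K^2$ sums.

Hence,

\[
W^2 \le K N p^m + K^2 N d^{K+s-2} p^{m-1/2},
\]

and so

\[
\begin{aligned}
|S_a(N)| &\le K^{-1} \bigl( K N p^m + K^2 N d^{K+s-2} p^{m-1/2} \bigr)^{1/2} + K \\
&= O\bigl( K^{-1/2} N^{1/2} p^{m/2} + d^{K/2} N^{1/2} p^{(m-1/2)/2} + K \bigr).
\end{aligned}
\]

Select

\[
K = \left\lceil 0.4\, \frac{\log p}{\log d} \right\rceil.
\]

Then after simple calculations we obtain the desired result. □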



Let $D_s(N)$ denote the discrepancy of the points (2) for $n = 0, \ldots, N - 1$.

Theorem 2. If the sequence $(u_n)$, given by (1) generated by a polynomial
$f(X_1, \ldots, X_m) \in \mathbb{F}_p[X_1, \ldots, X_m]$ of total degree $d \ge 2$ and having a dominating
term, is purely periodic with period $t$ and $t \ge N \ge 1$, then the bound

\[
D_s(N) = O\bigl( N^{-1/2} p^{m/2} \log^{-1/2} p \, (\log \log p)^s \bigr)
\]

holds, where the implied constant depends only on $d$ and $s$.

Proof. The statement follows from Lemma 1, taken with

\[
L = \left\lceil N^{1/2} p^{-m/2} \log^{1/2} p \right\rceil
\]

and the bound of Theorem 1. □



4 Remarks 

The results of Theorems 1 and 2 are nontrivial only for sufficiently large values
of $N$, namely for $t \ge N \ge p^m \log^{-1+\varepsilon} p$ with some fixed $\varepsilon > 0$, and it would be
very important to extend the range of $N$ for which $D_s(N) = o(1)$.

We believe that Theorems 1 and 2 hold for much more general polynomials,
not necessarily only polynomials with a dominating term. Obtaining such an
extension would be very important. We note that for univariate polynomials
this condition is automatically satisfied, thus our results indeed generalise those
of [17].

It would be very interesting to extend the results of this paper to the case of
nonlinear generators with rational functions

\[
f(X_1, \ldots, X_m) \in \mathbb{F}_p(X_1, \ldots, X_m).
\]

At the beginning of this paper we have mentioned two special generators
in the case $m = 1$, namely the inversive generator and the power generator, for
which essentially stronger results than in the general case are known, see [8,16,19]
and [6,7], respectively. It is desirable to understand what the analogues of
these special generators are in the case $m \ge 2$ and to extend the results of
[6,7,8,16,19] to these analogues.

Finally we remark that our method works for generators modulo a composite
number as well. But one should expect weaker results because instead of the
very powerful Weil bound one will have to use bounds on exponential sums with
composite denominator which are essentially weaker, see [20].

Abstract. We study a representation of Boolean functions (and more
generally of integer-valued / complex-valued functions), not used until
now in coding and cryptography, which yields more information than
the currently known representations, on the combinatorial, spectral and
cryptographic properties of the functions.

Keywords: Boolean function, Fourier spectrum, cryptography. 



1 Introduction 

Let n be any positive integer. From a cryptographic and coding theoretic point
of view, we are interested in Boolean (i.e. {0,1}-valued) functions defined on
the set $\mathbb{F}_2^n$ of all binary words of length n. This set is viewed as an
$\mathbb{F}_2$-vector space of dimension n (the addition in this vector-space is mod 2 and
will be denoted by $\oplus$). Since {0,1} can be viewed either as a subset of $\mathbb{Z}$ or as
the field $\mathbb{F}_2$, we need to distinguish between the addition in $\mathbb{Z}$ (denoted by +;
the sum of several terms $b_1, \ldots, b_r$ will then be denoted by $\sum_{i=1}^{r} b_i$) and the
addition in $\mathbb{F}_2$ (denoted by $\oplus$; the sum of several terms $b_1, \ldots, b_r$ will be
denoted by $\bigoplus_{i=1}^{r} b_i$).

The basic Boolean functions are the affine functions:

$$f(x_1, \ldots, x_n) = a_1 x_1 \oplus \cdots \oplus a_n x_n \oplus a_0 = a \cdot x \oplus a_0$$

where $a = (a_1, \ldots, a_n) \in \mathbb{F}_2^n$ and $a_0 \in \mathbb{F}_2$. The expression
$a \cdot x = a_1 x_1 \oplus \cdots \oplus a_n x_n$ is the usual inner product in $\mathbb{F}_2^n$. The set of all
the affine functions is the Reed-Muller code of length $2^n$ and order 1 (a Boolean
function is identified with the $2^n$-long binary word of its values, assuming that
some order on $\mathbb{F}_2^n$ is chosen).

Important parameters of Boolean functions are:

— the Hamming weight: w(f) is the size of the support of f, i.e. of the set
$\{a \in \mathbb{F}_2^n \mid f(a) = 1\}$; the Hamming distance between two functions f and g
is $d(f,g) = w(f \oplus g)$;

— the Fourier spectrum, i.e. the data of all the values of the Fourier transform of
f: $\hat{f}(a) = \sum_{x \in \mathbb{F}_2^n} f(x)(-1)^{a \cdot x}$. The related Fourier spectrum of the function
$\chi_f = (-1)^f = 1 - 2f$ is equal to $\hat{\chi_f} = \hat{1} - 2\hat{f} = 2^n \delta_0 - 2\hat{f}$, where $\delta_0$ is the
Dirac symbol at the all-zero word (recall that, in general, $\delta_a(x)$ equals 1 if
x = a and 0 otherwise);

— the nonlinearity $N_f$ of f, i.e. the Hamming distance between f and the set
of affine functions. We have $N_f = 2^{n-1} - \frac{1}{2} \max_{a \in \mathbb{F}_2^n} |\hat{\chi_f}(a)|$.

* AMS classification numbers: 06E30, 11T23, 94A60.

The Boolean functions whose nonlinearity is maximum are called bent functions
(cf. [9]). They are interesting from a coding point of view, since they correspond
to the words of length $2^n$ whose distance to the Reed-Muller code of order 1 is
equal to the covering radius of this code. The bent functions on $\mathbb{F}_2^n$ with n even
have extra properties which make them also interesting from a cryptographic
point of view. They are called perfect nonlinear [6] and characterized by the
fact that, for any nonzero word a, the Boolean function $x \mapsto f(x) \oplus f(x \oplus a)$ is
balanced (i.e. takes the values 0 and 1 equally often).

We describe now the representations of Boolean functions currently used in cryp- 
tography and in coding theory. 

The truth-table of a Boolean function f is the one-dimensional table, indexed by
the elements of $\mathbb{F}_2^n$ (some order being chosen), whose entry at $a \in \mathbb{F}_2^n$ is f(a).
The advantage of this representation is its simplicity together with the fact that
the weight of f is directly computed from $w(f) = \sum_{a \in \mathbb{F}_2^n} f(a)$.

But it has important drawbacks: 

— it does not give any information on the algebraic degree of the function and 
on the number of terms in its algebraic normal form (see below); 

— the affine functions have the same complexity as any other function; 

— it does not directly characterize perfect nonlinear functions. 

The Algebraic Normal Form A.N.F. (cf. [7]):

$$f(x_1, \ldots, x_n) = \bigoplus_{u \in \mathbb{F}_2^n} a_u \left( \prod_{i=1}^{n} x_i^{u_i} \right); \quad a_u \in \mathbb{F}_2 \qquad (1)$$

is the sum, modulo 2, of monomials in which each of the n binary variables
has degree at most 1. For simplicity, we shall write $x^u$ instead of $\prod_{i=1}^{n} x_i^{u_i}$.
The algebraic degree of f is the global degree of its A.N.F. The A.N.F. can
be computed from the truth table with a complexity $O(n 2^n)$ by a butterfly
algorithm (cf. [4]); conversely, the truth table can be computed from the A.N.F.
with the same complexity. We have $a_u = \bigoplus_{x \preceq u} f(x)$, where, for
$x = (x_1, \ldots, x_n)$ and $u = (u_1, \ldots, u_n)$, the relation $x \preceq u$ means that
$x_i \le u_i$ for every $i \in \{1, \ldots, n\}$.
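The butterfly computation of the A.N.F. mentioned above can be sketched as follows. This is only an illustrative sketch, not the algorithm of [4]: the function name `anf_from_truth_table` and the indexing convention (the word $(x_1, \ldots, x_n)$ sits at index $\sum_i x_i 2^{n-i}$) are assumptions made for the example.

```python
def anf_from_truth_table(truth_table):
    """Binary Moebius transform: truth table (0/1 list of length 2**n) -> A.N.F. coefficients.

    Entry u of the result is a_u, the XOR of f(x) over all x whose support
    is contained in the support of u. The cost is n * 2**(n-1) XOR operations.
    """
    coeffs = list(truth_table)
    size = len(coeffs)
    n = size.bit_length() - 1
    if size != 1 << n:
        raise ValueError("the truth table length must be a power of two")
    for i in range(n):
        step = 1 << i
        for block in range(0, size, 2 * step):
            for x in range(block, block + step):
                # Fold the half where the current variable is 0 into the half where it is 1.
                coeffs[x + step] ^= coeffs[x]
    return coeffs


# Majority of three variables, index = 4*x1 + 2*x2 + x3.
majority = [0, 0, 0, 1, 0, 1, 1, 1]
majority_anf = anf_from_truth_table(majority)
```

For the majority function the nonzero coefficients sit at indices 3, 5 and 6, i.e. the A.N.F. is $x_2 x_3 \oplus x_1 x_3 \oplus x_1 x_2$. Applying the same transform to the coefficient list gives back the truth table, which matches the remark that both directions have the same complexity.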

The qualities of the A.N.F. are the following: 

— it leads to an important class of linear codes: the Reed-Muller codes; 

— the complexity of the A.N.F. of a function is coherent with its algebraic 
complexity, i.e. its implementation with and/xor gates; 

but it has also some drawbacks: 

— there exists no simple formula which computes the weight of a function, 
given its A.N.F.; the only known formula, cf. [10], page 124 is complex; 

— a fortiori, there exists no simple formula which computes the Fourier spec- 
trum or the nonlinearity of a function, given its A.N.F.; 

— it does not directly characterize perfect nonlinear functions.

The Fourier spectrum defined above can also be considered as a representation.
It can be computed from the truth table by a butterfly algorithm with complexity
$O(n 2^n)$ (cf. [4]). Its qualities are the following:

— the weight of f is directly given by $w(f) = \hat{f}(0)$;

— the affine functions are characterized by the fact that one of the values of
the Fourier spectrum of $\chi_f$ is equal to $\pm 2^n$ and all the others are null;

— perfect nonlinear functions are directly characterized by means of the Fourier
spectrum of $\chi_f$: f is perfect nonlinear if and only if, for every word $a \in \mathbb{F}_2^n$,
the value at a of the Fourier transform of $\chi_f$ is equal to $\pm 2^{n/2}$.

But this representation has at least two important drawbacks: 

— there is no simple characterization of the fact that a list $(\mu_a)_{a \in \mathbb{F}_2^n}$ of integers
corresponds to the Fourier spectrum of a Boolean function;

— the algebraic degree cannot be directly deduced from the Fourier spectrum. 
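The properties listed above can be checked numerically with a small butterfly transform. The sketch below is an illustration under stated assumptions: the names `fourier_transform` and `nonlinearity` are hypothetical, and the word $(x_1, \ldots, x_n)$ is assumed to sit at index $\sum_i x_i 2^{n-i}$.

```python
def fourier_transform(values):
    """Return the list of sum_x values[x] * (-1)**(a.x), for a = 0 .. 2**n - 1."""
    spectrum = list(values)
    size = len(spectrum)
    n = size.bit_length() - 1
    if size != 1 << n:
        raise ValueError("the input length must be a power of two")
    for i in range(n):
        step = 1 << i
        for block in range(0, size, 2 * step):
            for x in range(block, block + step):
                low, high = spectrum[x], spectrum[x + step]
                spectrum[x], spectrum[x + step] = low + high, low - high
    return spectrum


def nonlinearity(truth_table):
    """N_f = 2**(n-1) - max_a |Fourier transform of (-1)**f at a| / 2."""
    signs = [1 - 2 * bit for bit in truth_table]
    return len(truth_table) // 2 - max(abs(v) for v in fourier_transform(signs)) // 2


# f(x1, x2, x3, x4) = x1 x2 XOR x3 x4, a classical perfect nonlinear function for n = 4.
bent = [((k >> 3) & (k >> 2) & 1) ^ ((k >> 1) & k & 1) for k in range(16)]
bent_weight = fourier_transform(bent)[0]
bent_sign_spectrum = fourier_transform([1 - 2 * bit for bit in bent])
```

Here the value of the transform of f at the all-zero word is the weight of f, and every value of the transform of $\chi_f$ has absolute value $2^{n/2} = 4$, in line with the characterization of perfect nonlinear functions given above.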

2 The Numerical Normal Form 

This representation of Boolean functions (and more generally of complex-valued
functions) is similar to the Algebraic Normal Form, but with numerical
coefficients. It has been used in the context of circuit complexity. We study this
representation in a more systematic way which allows direct formulae for the
weight and the Fourier spectrum. It leads to a characterization of perfect
nonlinear functions and to divisibility properties of the weight. Boolean functions are
characterized with a single formula by means of this representation. The extra cost
of such an improvement is the storage of $2^n$ integers in the range $\{-2^n, \ldots, 2^n\}$
instead of $2^n$ bits.

Definition 1. Let f be a complex-valued function on $\mathbb{F}_2^n$. We call Numerical
Normal Form (N.N.F.) of f the following expression of f as a polynomial:

$$f(x_1, \ldots, x_n) = \sum_{u \in \mathbb{F}_2^n} \lambda_u x^u, \quad \lambda_u \in \mathbb{C}. \qquad (2)$$

Proposition 1. Any complex-valued function admits a unique Numerical Nor- 
mal Form. 

Proof. The set of all complex-valued functions on $\mathbb{F}_2^n$ and the set of all
polynomials over $\mathbb{C}$ in n variables such that every variable has degree at most 1 are
vector-spaces of dimension $2^n$ over $\mathbb{C}$. The mapping which maps any such
polynomial $P(x_1, \ldots, x_n)$ to the complex-valued function $f : a \in \mathbb{F}_2^n \mapsto P(a_1, \ldots, a_n)$
is linear. It is enough to show that this linear mapping is surjective.

For every $a \in \mathbb{F}_2^n$, the polynomial expression of the Dirac function $\delta_a(x)$ is
equal to the product, when i ranges over $\{1, \ldots, n\}$, of $x_i$ if $a_i = 1$, and of $1 - x_i$
if $a_i = 0$. By expanding this product, we deduce the existence of a polynomial
representing $\delta_a$. A polynomial representing any complex-valued function is then
obtained by using the equality $f(x) = \sum_{a \in \mathbb{F}_2^n} f(a) \delta_a(x)$.

3 Relations between N.N.F. and other Representations

3.1 N.N.F. and Truth Table

According to the definition of the N.N.F., the truth table of f can be recovered
from its N.N.F. by the formula $f(a) = \sum_{u \preceq a} \lambda_u$, $\forall a \in \mathbb{F}_2^n$. Conversely, it is
possible to derive an explicit formula giving the N.N.F. of the function by means
of its truth table.

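Proposition 2. Let f be any complex-valued function on $\mathbb{F}_2^n$. For every
$u \in \mathbb{F}_2^n$, the coefficient $\lambda_u$ of the monomial $x^u$ in the N.N.F. of f is:

$$\lambda_u = (-1)^{w(u)} \sum_{a \in \mathbb{F}_2^n \mid a \preceq u} (-1)^{w(a)} f(a)$$

where w(u) denotes the Hamming weight of the word u (i.e. the number of
nonzero coordinates).

Proof. Since we have $f(x) = \sum_{a \in \mathbb{F}_2^n} f(a) \delta_a(x)$, it is enough to prove that the
coefficient of $x^u$ in the N.N.F. of the Dirac symbol $\delta_a$ is $(-1)^{w(u)-w(a)}$ if $a \preceq u$
and 0 otherwise.

Denoting by supp(a) the set $\{i = 1, \ldots, n \mid a_i = 1\}$, the value of $\delta_a(x)$ is equal
to $\left( \prod_{i \in supp(a)} x_i \right) \left( \prod_{i \notin supp(a)} (1 - x_i) \right)$. Let $I_a$ be the complement of supp(a)
in $\{1, \ldots, n\}$; we have $\prod_{i \in I_a} (1 - x_i) = \sum_{v \in \mathbb{F}_2^n \mid supp(v) \subseteq I_a} (-1)^{w(v)} x^v$, and the
result holds, denoting by u the word whose support equals the union of those of
a and v.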

We now give a butterfly algorithm for computing the coefficients $\lambda_u$. The
truth table of a function f, in n variables, is the concatenation of the truth
tables of the two functions, in (n − 1) variables:

$$f_0 : (x_2, \ldots, x_n) \mapsto f(0, x_2, \ldots, x_n)$$
$$f_1 : (x_2, \ldots, x_n) \mapsto f(1, x_2, \ldots, x_n).$$

Let $f_0(x) = \sum_{u \in \mathbb{F}_2^{n-1}} \mu_u x^u$ and $f_1(x) = \sum_{u \in \mathbb{F}_2^{n-1}} \nu_u x^u$ be respectively the
N.N.F.'s of $f_0$ and $f_1$. It is easy to deduce from Proposition 2 that
$\lambda_{(0,u_2,\ldots,u_n)} = \mu_{(u_2,\ldots,u_n)}$ and $\lambda_{(1,u_2,\ldots,u_n)} = \nu_{(u_2,\ldots,u_n)} - \mu_{(u_2,\ldots,u_n)}$.

This leads to the F.F.T.-like algorithm which computes the N.N.F. of any
function with a complexity of $n 2^{n-1}$ elementary subtractions:

procedure NNF(f)
  for i ← 0 to n − 1 do
    b ← 0;
    repeat
      for x ← b to b + 2^i − 1 do
        f[x + 2^i] ← f[x + 2^i] − f[x];
      end for;
      b ← b + 2^(i+1);
    until b = 2^n;
  end for;
end procedure

In this algorithm, the function f is given by an array of $2^n$ numbers (integer,
real or complex), and the value of f at vector $u = (u_1, \ldots, u_n)$ is the k-th entry
of this array, where $k = \sum_{i=1}^{n} u_i 2^{n-i} = u_1 2^{n-1} + \cdots + u_n$.
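The procedure above translates almost line for line into Python. The following is a minimal sketch rather than a reference implementation; the function name `nnf` is an assumption, and the array is copied instead of being modified in place.

```python
def nnf(truth_table):
    """Return the N.N.F. coefficients of the function given by its value table.

    Entry k of the input is f(u) with k = u_1 * 2**(n-1) + ... + u_n, and entry k
    of the output is the coefficient of the monomial x**u for the same u.
    """
    coeffs = list(truth_table)
    size = len(coeffs)
    n = size.bit_length() - 1
    if size != 1 << n:
        raise ValueError("the table length must be a power of two")
    for i in range(n):
        step = 1 << i
        # The inner loops play the role of the repeat/until block of the procedure.
        for b in range(0, size, 2 * step):
            for x in range(b, b + step):
                coeffs[x + step] -= coeffs[x]
    return coeffs


xor_nnf = nnf([0, 1, 1, 0])                    # x1 XOR x2
majority_nnf = nnf([0, 0, 0, 1, 0, 1, 1, 1])   # majority of x1, x2, x3
```

For $x_1 \oplus x_2$ this gives the polynomial $x_1 + x_2 - 2 x_1 x_2$, and for the majority function $x_2 x_3 + x_1 x_3 + x_1 x_2 - 2 x_1 x_2 x_3$; reducing the coefficients mod 2 gives the corresponding A.N.F.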



3.2 N.N.F. and A.N.F. 



Given the N.N.F. of a Boolean function f, the A.N.F. of f is simply equal to its
N.N.F. mod 2. Conversely, the coefficients of the N.N.F. can be recovered from
those of the A.N.F. This uses the well-known Poincaré formula, which computes
the xor of a set of binary values by using usual arithmetic operations.

Lemma 1 (Poincaré formula). Let $a_1, \ldots, a_m$ be elements of the set {0, 1}.
The following formula holds:

$$\bigoplus_{i=1}^{m} a_i = \sum_{k=1}^{m} (-2)^{k-1} \sum_{1 \le i_1 < \cdots < i_k \le m} a_{i_1} \cdots a_{i_k}.$$
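The formula of Lemma 1 is easy to check exhaustively for small m. The sketch below does so; the function name `xor_by_poincare` is an assumption chosen for the example.

```python
from itertools import combinations, product
from math import prod


def xor_by_poincare(bits):
    """Evaluate the right-hand side of the Poincare formula with ordinary integer arithmetic."""
    m = len(bits)
    return sum(
        (-2) ** (k - 1) * sum(prod(chosen) for chosen in combinations(bits, k))
        for k in range(1, m + 1)
    )


# Compare with the parity of the bits for every binary word of length 1 to 6.
poincare_holds = all(
    xor_by_poincare(bits) == sum(bits) % 2
    for m in range(1, 7)
    for bits in product((0, 1), repeat=m)
)
```

When exactly t of the bits equal 1, the right-hand side reduces to $\sum_{k \ge 1} (-2)^{k-1} \binom{t}{k} = (1 - (-1)^t)/2$, which is the parity of t.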

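In the following theorem, the condition "$\{u^1, \ldots, u^k\} \mid u^1 \vee \cdots \vee u^k = u$"
means that the words $u^1, \ldots, u^k$ are all distinct, in indefinite order, and that
the union of their supports is equal to the support of the word u.

Theorem 1. Let $f(x) = \sum_{u \in \mathbb{F}_2^n} \lambda_u x^u$ be the N.N.F. of a Boolean function f
on $\mathbb{F}_2^n$. Let $a_u = \lambda_u \bmod 2$ be the coefficient of $x^u$ in its A.N.F. Then one can
retrieve $\lambda_u$ from the $a_v$'s with the following relation:

$$\lambda_u = \sum_{k=1}^{2^n} (-2)^{k-1} \sum_{\{u^1, \ldots, u^k\} \mid u^1 \vee \cdots \vee u^k = u} a_{u^1} \cdots a_{u^k}. \qquad (3)$$

Proof. For every value of x, the value of f(x) is a xor of the $2^n$ terms $a_u x^u$. By
applying Poincaré's formula, and by noticing that for all vectors u and v, one
has $x^u x^v = x^{u \vee v}$, we obtain:

$$f(x) = \sum_{k=1}^{2^n} (-2)^{k-1} \sum_{\{u^1, \ldots, u^k\}} a_{u^1} \cdots a_{u^k} \, x^{u^1 \vee \cdots \vee u^k}
= \sum_{u \in \mathbb{F}_2^n} \left( \sum_{k=1}^{2^n} (-2)^{k-1} \sum_{\{u^1, \ldots, u^k\} \mid u^1 \vee \cdots \vee u^k = u} a_{u^1} \cdots a_{u^k} \right) x^u.$$

The latter expression is the unique N.N.F. of f and the result holds.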

Remark: Denoting $\{u^1, \ldots, u^k\}$ by G and identifying any binary word u with
the corresponding monomial $x^u$, we can interpret Theorem 1 as follows: let F be
the set of all the monomials with coefficient 1 in the A.N.F. of f; we consider all
the non-empty subsets G of F and, for each of them, we denote by |G| the size
of G, and by supp(G) the set of all the indices of those variables involved in G.
Then relation (3) is equivalent to

$$\lambda_u = \sum_{G \subseteq F;\ G \neq \emptyset;\ supp(G) = supp(u)} (-2)^{|G|-1}.$$

3.3 N.N.F. and Fourier Spectrum

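Let $f(x) = \sum_{u \in \mathbb{F}_2^n} \lambda_u x^u$ be any Boolean (or complex-valued) function. For
every word $a \in \mathbb{F}_2^n$, we have:

$$\hat{f}(a) = \sum_{x \in \mathbb{F}_2^n} f(x)(-1)^{a \cdot x} = \sum_{x \in \mathbb{F}_2^n} \sum_{u \preceq x} \lambda_u (-1)^{a \cdot x} = \sum_{u \in \mathbb{F}_2^n} \lambda_u \sum_{x \in \mathbb{F}_2^n \mid u \preceq x} (-1)^{a \cdot x}.$$

Changing x into $\bar{x} = x \oplus (1, \ldots, 1)$, we obtain:

$$\hat{f}(a) = \sum_{u \in \mathbb{F}_2^n} \lambda_u \sum_{x \in \mathbb{F}_2^n \mid x \preceq \bar{u}} (-1)^{a \cdot \bar{x}} = (-1)^{w(a)} \sum_{u \in \mathbb{F}_2^n} \lambda_u \sum_{x \in \mathbb{F}_2^n \mid x \preceq \bar{u}} (-1)^{a \cdot x}.$$

Since the set $\{x \in \mathbb{F}_2^n \mid x \preceq \bar{u}\}$ is an (n − w(u))-dimensional vector-space, and
since its orthogonal is $\{a \in \mathbb{F}_2^n \mid a \preceq u\}$, the sum $\sum_{x \in \mathbb{F}_2^n \mid x \preceq \bar{u}} (-1)^{a \cdot x}$ is equal
to $2^{n-w(u)}$ if $a \preceq u$ and to 0 otherwise. Thus:

$$\hat{f}(a) = (-1)^{w(a)} \sum_{u \in \mathbb{F}_2^n \mid a \preceq u} 2^{n-w(u)} \lambda_u. \qquad (4)$$

In particular:

$$w(f) = \hat{f}(0) = \sum_{u \in \mathbb{F}_2^n} 2^{n-w(u)} \lambda_u. \qquad (5)$$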
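Relations (4) and (5) can be checked numerically on a small example. In the sketch below all function names are assumptions; words are encoded as integers whose binary digits are the coordinates, so that $a \preceq u$ becomes `u & a == a` and w(u) is the number of set bits.

```python
def nnf(truth_table):
    """N.N.F. coefficients computed by the butterfly of Sect. 3.1."""
    coeffs = list(truth_table)
    size = len(coeffs)
    n = size.bit_length() - 1
    for i in range(n):
        step = 1 << i
        for b in range(0, size, 2 * step):
            for x in range(b, b + step):
                coeffs[x + step] -= coeffs[x]
    return coeffs


def weight(word):
    """Hamming weight of a word encoded as an integer."""
    return bin(word).count("1")


def fourier_from_nnf(coeffs, a):
    """Right-hand side of relation (4)."""
    n = len(coeffs).bit_length() - 1
    total = sum(
        coeff * 2 ** (n - weight(u))
        for u, coeff in enumerate(coeffs)
        if u & a == a  # a is covered by u
    )
    return -total if weight(a) % 2 else total


def fourier_direct(truth_table, a):
    """Definition of the Fourier transform, summed term by term."""
    return sum(value * (-1) ** weight(a & x) for x, value in enumerate(truth_table))


majority = [0, 0, 0, 1, 0, 1, 1, 1]
majority_coeffs = nnf(majority)
weight_from_nnf = fourier_from_nnf(majority_coeffs, 0)  # relation (5)
```

For the majority function in three variables, relation (5) gives $2 + 2 + 2 - 2 = 4$, which is indeed its weight.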

Remark: Thanks to relations (3) and (5), the weight of a Boolean function can
be expressed by means of the coefficients of its A.N.F.:

$$w(f) = \sum_{u \in \mathbb{F}_2^n} 2^{n-w(u)} \sum_{k=1}^{2^n} (-2)^{k-1} \sum_{\{u^1, \ldots, u^k\} \mid u^1 \vee \cdots \vee u^k = u} a_{u^1} \cdots a_{u^k}.$$

This latter relation can be written $w(f) = \sum_{G \subseteq F;\ G \neq \emptyset} (-2)^{|G|-1} 2^{v(G)}$, where F
and G are defined as in the remark following Theorem 1 and where v(G) denotes
the number of variables which are not involved in G. We obtain in this way the
formula given in [10], page 124. We deduce, similarly:

$$\hat{\chi_f}(a) = 2^n \delta_0(a) + (-1)^{w(a)+1} \sum_{u \in \mathbb{F}_2^n \mid a \preceq u} 2^{n-w(u)+1} \lambda_u. \qquad (6)$$

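Conversely, we can express the N.N.F. coefficients of a Boolean function by
means of its Fourier coefficients. According to Proposition 2, $\lambda_u$ is equal to
$(-1)^{w(u)} \sum_{a \in \mathbb{F}_2^n \mid a \preceq u} (-1)^{w(a)} f(a)$. Using the inverse Fourier transform, we get:

$$\lambda_u = (-1)^{w(u)} \sum_{a \preceq u} (-1)^{w(a)} \, 2^{-n} \sum_{x \in \mathbb{F}_2^n} \hat{f}(x) (-1)^{a \cdot x}
= 2^{-n} (-1)^{w(u)} \sum_{x \in \mathbb{F}_2^n} \hat{f}(x) \sum_{a \in \mathbb{F}_2^n \mid a \preceq u} (-1)^{w(a) + a \cdot x}.$$

Since the function $a \mapsto w(a) + a \cdot x \bmod 2$ is linear on the vector-space
$\{a \in \mathbb{F}_2^n \mid a \preceq u\}$, the sum $\sum_{a \in \mathbb{F}_2^n \mid a \preceq u} (-1)^{w(a) + a \cdot x}$ is nonzero if and only if this
linear function is null on this vector-space. Thus this sum is
equal to $2^{w(u)}$ if $u \preceq x$, and to 0 otherwise. Finally:

$$\lambda_u = 2^{-n} (-2)^{w(u)} \sum_{x \in \mathbb{F}_2^n \mid u \preceq x} \hat{f}(x). \qquad (7)$$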



Notice that, according to relations (6) and (7), the N.N.F. degree of a function is
equal to the maximum weight of an element belonging to the Fourier transform
support.

4 Characterization of the N.N.F. of Boolean Functions
among those of Integer-Valued Functions; Examples of
such N.N.F.

4.1 Characterization of Boolean Functions 

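From the condition $y \in \{0, 1\} \iff y^2 = y$, applied to all the values of f, we get:

Proposition 3. The polynomial $\sum_{u \in \mathbb{F}_2^n} \lambda_u x^u$, $\lambda_u \in \mathbb{Z}$ (or $\mathbb{R}$ or $\mathbb{C}$), is the
N.N.F. of a Boolean function if and only if

$$\forall u \in \mathbb{F}_2^n, \quad \lambda_u = \sum_{(v, v') \mid u = v \vee v'} \lambda_v \lambda_{v'}.$$

Since this condition has to be satisfied by the $2^n$ vectors of $\mathbb{F}_2^n$, it has high
computational complexity. But there is a simpler condition: an integer-valued
function f is Boolean if and only if $\sum_{x \in \mathbb{F}_2^n} f^2(x) = \sum_{x \in \mathbb{F}_2^n} f(x)$. Thus, by relation (5):

Proposition 4. The polynomial $\sum_{u \in \mathbb{F}_2^n} \lambda_u x^u$, $\lambda_u \in \mathbb{Z}$, is the N.N.F. of a
Boolean function if and only if

$$\sum_{u \in \mathbb{F}_2^n} 2^{n-w(u)} \sum_{(v, v') \mid u = v \vee v'} \lambda_v \lambda_{v'} = \sum_{u \in \mathbb{F}_2^n} 2^{n-w(u)} \lambda_u.$$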
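The single condition of Proposition 4 can be sketched in a few lines. This is an illustration only: the function name `is_boolean_nnf` is an assumption, words are encoded as integers (so that $v \vee v'$ is a bitwise OR), and the coefficients are assumed to be integers.

```python
def is_boolean_nnf(coeffs):
    """Test the condition of Proposition 4 on a list of integer N.N.F. coefficients."""
    size = len(coeffs)
    n = size.bit_length() - 1
    if size != 1 << n:
        raise ValueError("the number of coefficients must be a power of two")

    def scale(u):
        # The factor 2**(n - w(u)) of relation (5).
        return 2 ** (n - bin(u).count("1"))

    # Left-hand side: relation (5) applied to f squared, whose coefficient at u
    # is the sum of the products of coefficients at v and v' with v OR v' = u.
    lhs = sum(
        cv * cw * scale(v | w)
        for v, cv in enumerate(coeffs)
        for w, cw in enumerate(coeffs)
    )
    rhs = sum(cu * scale(u) for u, cu in enumerate(coeffs))
    return lhs == rhs


xor_is_boolean = is_boolean_nnf([0, 1, 1, -2])   # x1 + x2 - 2 x1 x2
sum_is_boolean = is_boolean_nnf([0, 1, 1, 0])    # x1 + x2 takes the value 2
```

The double loop costs on the order of $4^n$ multiplications, so this direct evaluation is only meant for small n.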



4.2 N.N.F. of Affine Functions

Let $f(x) = a \cdot x \oplus \varepsilon$ be any affine function ($a \in \mathbb{F}_2^n$, $\varepsilon \in \{0, 1\}$). Using relation
(3), we obtain:

$$\lambda_u = \begin{cases} (-1)^{\varepsilon} (-2)^{w(u)-1} & \text{if } u \preceq a \text{ and } u \neq 0 \\ \varepsilon & \text{if } u = 0 \\ 0 & \text{otherwise.} \end{cases}$$

4.3 N.N.F. of Quadratic Functions

It can be shown that $\lambda_u$ is equal to 0 or to $\pm 1$ if u = 0, and to 0 or to $\pm 2^i$,
with $\lceil w(u)/2 \rceil - 1 \le i \le w(u) - 1$ otherwise. Conversely, this condition implies that f
is quadratic, since we have $a_u = \lambda_u \bmod 2 = 0$ for every word of weight greater
than 2.

It is well known that there exist $\lfloor n/2 \rfloor + 1$ orbits of the set of quadratic functions
under the actions of composition with any affine isomorphism and addition of
any affine function (cf. [7]). It is possible to read directly on the N.N.F. which
orbit contains a given function.

4.4 N.N.F. of Symmetric Functions 

Let $r \in \{0, \ldots, n\}$ and let f be the Boolean function whose support is the set of
all the words of weight r in $\mathbb{F}_2^n$. According to Proposition 2, the coefficient of
$x^u$, $u \in \mathbb{F}_2^n$, in the N.N.F. of f is $\lambda_u = (-1)^{w(u)-r} \binom{w(u)}{r}$. Any symmetric function
(i.e. any function $f(x_1, \ldots, x_n)$ invariant under permutation of the variables) is
equal to a sum of functions of this form.

5 Properties Deduced from the N.N.F. 

5.1 A Characterization of Perfect Nonlinear Functions 

We shall need first to show a property of the N.N.F. concerning the dual $\tilde{f}$
of a perfect nonlinear function f, defined on $\mathbb{F}_2^n$ by $\chi_{\tilde{f}}(a) = 2^{-n/2} \hat{\chi_f}(a)$.
Using relation (6) and the equality $\tilde{f} = \frac{1 - \chi_{\tilde{f}}}{2}$, we have
$\tilde{f}(a) = \frac{1}{2} - 2^{n/2-1} \delta_0(a) + (-1)^{w(a)} \sum_{u \in \mathbb{F}_2^n \mid a \preceq u} 2^{n/2-w(u)} \lambda_u$.
Changing u into $\bar{u}$ in this latter relation, we
obtain the N.N.F. of $\tilde{f}$ by expanding the following relation:

$$\tilde{f}(x) = \frac{1}{2} - 2^{n/2-1} \prod_{i=1}^{n} (1 - x_i) + \sum_{u \in \mathbb{F}_2^n} 2^{w(u)-n/2} \lambda_{\bar{u}} \prod_{i=1}^{n} (1 - x_i)^{u_i} (1 - 2 x_i)^{1-u_i}. \qquad (8)$$

This implies that, for every $u \neq 0$, $u \neq (1, \ldots, 1)$, the coefficient of $x^u$ in the
N.N.F. of $\tilde{f}$ is divisible by $2^{w(u)-n/2}$.

Proposition 5. Let $f(x) = \sum_{u \in \mathbb{F}_2^n} \lambda_u x^u$ be the N.N.F. of a Boolean function
f on $\mathbb{F}_2^n$. Then f is perfect nonlinear if and only if it satisfies:

1. for every u such that $\frac{n}{2} < w(u) < n$, the coefficient $\lambda_u$ is divisible by
$2^{w(u)-n/2}$;

2. $\lambda_{(1, \ldots, 1)}$ is congruent with $2^{n/2-1}$ mod $2^{n/2}$.

Proof. According to Lemma 1 of [1], f is perfect nonlinear if and only if, for
every $a \in \mathbb{F}_2^n$, $\hat{f}(a) \equiv 2^{n/2-1} \bmod 2^{n/2}$. Thus, according to relation (4), conditions
1. and 2. are sufficient for a Boolean function f to be perfect nonlinear.

Conversely, assume that f is perfect nonlinear. The observation above on
the N.N.F. of the dual of a perfect nonlinear function, applied to $\tilde{f}$ (whose dual
is f), shows that condition 1 is necessary. Condition 2 is also necessary since
$\hat{f}(1, \ldots, 1) = (-1)^n \lambda_{(1, \ldots, 1)}$ (from relation (4)).

5.2 Divisibility Properties of the N.N.F. Coefficients

We first show that, in expression (3), some terms are null, depending on the
degree of f and on the number of maximal degree monomials in the A.N.F. of f.

Proposition 6. Let f be a Boolean function of algebraic degree d; let r be the
number of monomials of degree d in the A.N.F. of f. Then, for every $u \in \mathbb{F}_2^n$,
the coefficient $\lambda_u$ of $x^u$ in its N.N.F. is equal to:

$$\lambda_u = \sum_{k = \max\left( \lceil \frac{w(u)}{d} \rceil, \lceil \frac{w(u)-r}{d-1} \rceil \right)}^{2^n} (-2)^{k-1} \sum_{\{u^1, \ldots, u^k\} \mid u^1 \vee \cdots \vee u^k = u} a_{u^1} \cdots a_{u^k}. \qquad (9)$$

Proof. Let $\{u^1, \ldots, u^k\}$ be a set of k distinct vectors of $\mathbb{F}_2^n$. If $w(u^1 \vee \cdots \vee u^k) > kd$,
then there exists an index j such that $w(u^j) > d$. Thus, for $k < \frac{w(u)}{d}$, i.e. for
$k \le \lceil \frac{w(u)}{d} \rceil - 1$, every term of sum (3) has a null factor $a_{u^j}$ with $w(u^j) > d$.

Moreover, if $w(u^1 \vee \cdots \vee u^k) > rd + (k - r)(d - 1) = k(d - 1) + r$, then
there exists an index j such that $w(u^j) > d$. Thus, for $k < \frac{w(u)-r}{d-1}$, i.e. for
$k \le \lceil \frac{w(u)-r}{d-1} \rceil - 1$, every term of sum (3) has a null factor $a_{u^j}$ with $w(u^j) > d$.

Corollary 1. Under the same hypothesis as in Proposition 6, the coefficient $\lambda_u$
of $x^u$ in its N.N.F. is a multiple of $2^{\lceil w(u)/d \rceil - 1}$ and of $2^{\lceil (w(u)-r)/(d-1) \rceil - 1}$.

McEliece's theorem on cyclic codes and Ax's theorem imply the following result
on Boolean functions (cf. [7], page 447):

Proposition 7. Let f be a Boolean function of degree d; then the weight of f
is a multiple of $2^{\lceil n/d \rceil - 1}$.

Corollary 1 and relation (5) give a new proof of this result. Indeed, according
to Corollary 1, each term of the sum in relation (5) is a multiple of
$2^{n-w(u)+\lceil w(u)/d \rceil - 1}$, and $n - w(u) + \lceil w(u)/d \rceil - 1$ is minimum for w(u) = n.

We know that the bound given by Proposition 7 is tight. However, Corollary 1
improves upon it for the functions which have few monomials of highest degree
in their A.N.F.:

Proposition 8. Let f be a Boolean function of degree d. Assume that the number
of monomials of degree d in its A.N.F. is $r < \frac{n}{d}$. Then $\lceil \frac{n-r}{d-1} \rceil \ge \lceil \frac{n}{d} \rceil$ and the
weight of f is a multiple of $2^{\lceil (n-r)/(d-1) \rceil - 1}$.

Proof. According to Corollary 1, each term of the sum in relation (5) is a multiple
of $2^{n-w(u)+\lceil w(u)/d \rceil - 1}$ and of $2^{n-w(u)+\lceil (w(u)-r)/(d-1) \rceil - 1}$. We have
$n - w(u) + \lceil \frac{w(u)}{d} \rceil - 1 > n - w(u) + \lceil \frac{w(u)-r}{d-1} \rceil - 1$ if and only if $w(u) < rd$, and it is a
simple matter to check that
$\max\left( n - w(u) + \lceil \frac{w(u)}{d} \rceil - 1,\ n - w(u) + \lceil \frac{w(u)-r}{d-1} \rceil - 1 \right)$ is minimum for
w(u) = n, and the result holds.

6 Conclusion

Since integer-valued functions are characterized by an integer-valued N.N.F. and
thanks to Proposition 4, the N.N.F. representation makes it easier to construct
a general Boolean function with prescribed combinatorial properties such
as bentness or high nonlinearity. This was not possible before, as Boolean functions
are not easily characterized with the Fourier transform, and combinatorial
properties are not directly ensured from the truth table or A.N.F. representation.

Abstract. The lattice $A_n^*$ is an important lattice because of its covering
properties in low dimensions. Conway and Sloane [3] appear to have
been the first to consider the problem of computing the nearest lattice
point in $A_n^*$. They developed and later improved [4] an algorithm which
is able to compute a nearest point in $O(n^2)$ arithmetic steps. In this
paper, a new algorithm is developed which is able to compute a nearest
point in $O(n \log n)$ steps.



1 Introduction 

The study of point lattices is of great importance in several areas of number 
theory, particularly the studies of quadratic forms, the geometry of numbers 
and simultaneous Diophantine approximation, and also to the practical engi- 
neering problems of quantisation and channel coding. They are also important 
in studying the sphere packing problem and the kissing number problem [5] . 
Let us now define what is meant by the term ‘point lattice’. 

Definition 1. Consider a set $B = \{b_1, b_2, \ldots, b_n\}$ of linearly independent points
in $\mathbb{R}^m$, $m \ge n$. The set

$$\Lambda = \{a_1 b_1 + a_2 b_2 + \cdots + a_n b_n \mid a_1, a_2, \ldots, a_n \in \mathbb{Z}\}$$

is a (point) lattice of rank n in $\mathbb{R}^m$ and B is a basis of $\Lambda$.

The lattice that is studied in this article, known as '$A_n^*$' following the
notation of Conway and Sloane or sometimes known as 'Voronoi's principal
lattice of the first type', is remarkable because of its covering properties in low
dimensions [5]. The author's interest in the lattice arises from his work in the
engineering problem of pulse train deinterleaving [1,2]. The lattice $A_n^*$ reduces
to the hexagonal lattice when n = 2 and to the body-centred cubic lattice when
n = 3. This is illustrated in Fig. 1.

The computational problem of finding a nearest lattice point to a given point
is the particular problem of interest here. van Emde Boas [7] showed that
the problem is NP-complete under certain conditions when the lattice itself,
or rather a basis thereof, is considered as an additional input parameter. For
specific lattices, the problem is considerably easier. The problem of computing
the nearest lattice point in $A_n^*$ was first studied by Conway and Sloane [3].
By decomposing the lattice $A_n^*$ into the finite superposition of translations of its
dual lattice $A_n$, the authors discovered an algorithm for computing the nearest
lattice point to a given point in $O(n^2 \log n)$ arithmetic operations. Later [4],
they were able to improve the execution time of the algorithm to $O(n^2)$ steps.

Fig. 1. Examples of the lattice $A_n^*$ for n = 2 (the hexagonal lattice) and n = 3
(the body-centred cubic lattice).

* This work was supported by the Australian Research Council under Grant S499721.

In this paper, a new algorithm is presented which is able to compute a nearest
lattice point in $A_n^*$ to a given point in $O(n \log n)$ arithmetic steps. After
discussing some mathematical preliminaries in Sect. 2, we investigate in Sect. 3
the so-called Voronoi regions and relevant sets of vectors in lattices and,
particularly, in $A_n^*$. A Voronoi region surrounds each lattice point, and any point
within this region is closer to the lattice point at the region's centre than to any
other lattice point. The relevant vectors are those lattice vectors which can be
used to define the boundary of the Voronoi region. Having found a relevant set
of vectors for $A_n^*$, we find it convenient in Sect. 4 to define three notions of
proximity of a given point to a lattice point. These notions allow us to decompose
the algorithm into three smaller algorithms, discussed in turn in Sects. 5-7, each
of which calculates a successively closer lattice point to the given point until at
last a nearest lattice point has been found.




106 I. Vaughan and L. Clarkson 



2 Mathematical Preliminaries 

In this section, we will briefly discuss the notation and terminology used
throughout the sequel. Firstly, we note that we will, on occasion and without notice, use
elements of $\mathbb{R}^m$ as if they were column vectors (elements of $\mathbb{R}^{m \times 1}$). We will use
the notation $P^m$ to refer to the set of (elementwise) permutations of the point
$(1, 2, \ldots, m) \in \mathbb{R}^m$. We also define the parameterised set $\sigma(\cdot)$ as follows.

Definition 2. We define the parameterised set $\sigma(\cdot)$ such that if $x \in \mathbb{R}^m$ and
$s \in \sigma(x) \subseteq P^m$ then

$$x_{s_1} \le x_{s_2} \le \cdots \le x_{s_m}.$$
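Finally, it is necessary to define the lattice $A_n^*$.

Definition 3. The lattice $A_n^*$ is the lattice of rank n in $\mathbb{R}^{n+1}$ whose basis vectors
are any n columns of the matrix

$$B = I - \frac{1}{n+1} \mathbf{1} \mathbf{1}^T = \frac{1}{n+1} \begin{pmatrix} n & -1 & \cdots & -1 \\ -1 & n & \cdots & -1 \\ \vdots & \vdots & \ddots & \vdots \\ -1 & -1 & \cdots & n \end{pmatrix}. \qquad (1)$$

Remark 1. The matrix B is a symmetric projection matrix, which is to say that
$B = B^T$ and $B^2 = B$.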
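Remark 1 can be checked with exact rational arithmetic. The sketch below is illustrative only; the helper names `projection_matrix`, `transpose` and `matmul` are assumptions, and matrices are plain lists of rows.

```python
from fractions import Fraction


def projection_matrix(n):
    """Return B = I - (1/(n+1)) * 1 1^T as a list of n+1 rows of Fractions."""
    size = n + 1
    return [
        [Fraction(int(i == j)) - Fraction(1, size) for j in range(size)]
        for i in range(size)
    ]


def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def matmul(left, right):
    return [
        [sum(x * y for x, y in zip(row, column)) for column in zip(*right)]
        for row in left
    ]


B = projection_matrix(3)
is_symmetric = transpose(B) == B
is_idempotent = matmul(B, B) == B
```

Each row of B sums to zero, so the n + 1 columns of B also sum to the zero vector; this is why only n of them are linearly independent.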



Remark 2. Notice that

$$b_i \cdot b_j = \begin{cases} \dfrac{n}{n+1} & \text{if } i = j, \\ -\dfrac{1}{n+1} & \text{otherwise} \end{cases} \qquad (2)$$

and, more generally, that

$$x \cdot b_j = x_j \qquad (3)$$

for any $x \in B\mathbb{R}^{n+1}$.

Remark 3. The lattice $A_n^*$ is constituted of vectors in a vector space whose
dimension, n + 1, exceeds the rank of the lattice, n.

Remark 4. In Figs. 1 and 2, $A_n^*$ is represented with respect to the hyperplane
$B\mathbb{R}^{n+1}$ on which the lattice vectors lie. That is, representative coordinates of
the lattice points in $\mathbb{R}^n$ are generated by premultiplication by $Q^T$, where
$Q \in \mathbb{R}^{(n+1) \times n}$ is a matrix whose columns form an orthonormal basis of $B\mathbb{R}^{n+1}$. In
the case of $A_3^*$, this is followed by an orthographic projection into $\mathbb{R}^2$.

3 Relevant Sets of Lattice Vectors

Consider a lattice $\Lambda$ in $\mathbb{R}^m$ and a norm $\|\cdot\|$. Around the origin, there is a region
V consisting of points which are closer to the origin than to any other lattice
point. That is, we define V as the set

$$V = \{x \in \mathbb{R}^m : \|x\| \le \|x - v\| \ \forall v \in \Lambda\}.$$

Such a region is called the Voronoi cell of the origin. The whole of $\mathbb{R}^m$ can
be tessellated by translating these cells by the lattice vectors. Therefore, an
algorithm to find the nearest lattice point to a given point can be interpreted as
an algorithm to find the particular translation of the Voronoi cell to which the
given point belongs.

Consider the Voronoi cell of the origin. It is a convex polytope with faces
that lie in the hyperplanes at the midway points along the lines connecting the
origin to nearby lattice points. The set of vectors which define the faces are the
(Voronoi-)relevant vectors of the lattice.

Definition 4. For a lattice $\Lambda$ in $\mathbb{R}^m$, a set of non-zero lattice vectors $\mathcal{R}$ is
relevant if for every $x \in \mathbb{R}^m$ which satisfies

$$\|x\| \le \|x - r\| \qquad (4)$$

for every $r \in \mathcal{R}$, this same inequality is also satisfied for every $r \in \Lambda$.

In other words, a relevant set contains lattice vectors such that, for any point
which is closer to the origin than to any of the vectors in the set, that point is
closer to the origin than to any other lattice point. A straightforward implication
of the definition is that a lattice point $v \in \Lambda$ is closest to some point $u \in \mathbb{R}^m$
if (4) is satisfied with x = u − v for each lattice point r in a relevant set.

For the Euclidean norm, we note that (4) is equivalent to

$$x \cdot x \le (x - r) \cdot (x - r) = x \cdot x - 2 x \cdot r + r \cdot r$$

and, after cancellation and rearrangement of terms, it is equivalent to

$$x \cdot r \le \tfrac{1}{2} \|r\|^2. \qquad (5)$$

Let us now turn our attention to the lattice $A_n^*$ with the Euclidean norm.
Henceforth, we will use the notation $\|\cdot\|$ exclusively to denote the Euclidean
norm of its argument.

Consider the set

$$\mathcal{R} = \left\{ \sum_{i=1}^{m} b_{s_i} \;\middle|\; s \in P^{n+1},\ 1 \le m \le n \right\}. \qquad (6)$$

We will show that this set is relevant to $A_n^*$.

Fig. 2. Examples of the Voronoi regions of the lattice $A_n^*$ for n = 2 (a hexagon)
and n = 3 (a truncated octahedron).



Lemma 1. Every element $v \in A_n^*$ can be expressed in the form

$$v = \sum_{j=1}^{n} c_j \sum_{i=1}^{j} b_{s_i} \qquad (7)$$

where the $c_j \in \mathbb{Z}$, $c_j \ge 0$, $j = 1, 2, \ldots, n$ and $s \in P^{n+1}$.



Proof. We can express any v with respect to the basis vectors $b_1, b_2, \ldots, b_n$ in
the obvious way, which is to say that

$$v = \sum_{i=1}^{n} a_i b_i$$

where the $a_j \in \mathbb{Z}$, $j = 1, 2, \ldots, n$. Clearly, we can extend the summation to
n + 1 terms, so that

$$v = \sum_{i=1}^{n+1} a_i b_i$$

with the coefficient $a_{n+1} = 0$.

Now, choose $s \in \sigma(-a)$, so that $a_{s_1} \ge a_{s_2} \ge \cdots \ge a_{s_{n+1}}$. Then, using the
fact that

$$\sum_{i=1}^{n+1} b_i = 0,$$

we find that

$$v = \sum_{i=1}^{n} (a_{s_i} - a_{s_{n+1}}) b_{s_i} = \sum_{j=1}^{n} (a_{s_j} - a_{s_{j+1}}) \sum_{i=1}^{j} b_{s_i}.$$

Therefore, with $c_j = a_{s_j} - a_{s_{j+1}}$, we satisfy (7). □
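The construction used in this proof is simple enough to run on a small example. The sketch below is only an illustration: the helper names are assumptions, indices are 0-based (so `s` is a permutation of 0, ..., n rather than of 1, ..., n + 1), and exact fractions are used for the coordinates.

```python
from fractions import Fraction


def basis_vector(j, n):
    """Column j (0-based) of the matrix B of Definition 3."""
    size = n + 1
    return [Fraction(int(i == j)) - Fraction(1, size) for i in range(size)]


def combine(weights, n):
    """Return the sum of weights[j] * b_j over j = 0 .. n."""
    point = [Fraction(0)] * (n + 1)
    for j, weight in enumerate(weights):
        point = [p + weight * b for p, b in zip(point, basis_vector(j, n))]
    return point


def decompose(a):
    """Given integer coefficients a_1..a_n, return (s, c) as in the proof of Lemma 1."""
    n = len(a)
    extended = list(a) + [0]                                  # a_{n+1} = 0
    s = sorted(range(n + 1), key=lambda i: -extended[i])      # non-increasing order
    c = [extended[s[j]] - extended[s[j + 1]] for j in range(n)]
    return s, c


def rebuild(s, c):
    """Evaluate the right-hand side of (7)."""
    n = len(c)
    weights = [0] * (n + 1)
    for j, cj in enumerate(c):
        for i in range(j + 1):
            weights[s[i]] += cj
    return combine(weights, n)


a = [2, -1, 3]
s, c = decompose(a)
```

With a = (2, −1, 3) the sorted order is $a_3 \ge a_1 \ge a_4 \ge a_2$, and the differences of consecutive sorted coefficients give c = (1, 2, 1), all non-negative as the lemma requires.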



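Lemma 2. If $v, w \in A_n^*$ can be expressed as

$$v = \sum_{i=1}^{p} b_{s_i} \quad \text{and} \quad w = \sum_{i=1}^{q} b_{s_i}$$

where $s \in P^{n+1}$, $1 \le p \le n$ and $1 \le q \le n$ then $v \cdot w > 0$.

Proof. Suppose $p \le q$. Then, by (2),

$$v \cdot w = \sum_{i=1}^{p} \sum_{j=1}^{q} b_{s_i} \cdot b_{s_j} = \frac{p(n + 1 - q)}{n + 1} > 0.$$

Since we can use the labels v and w arbitrarily, we conclude that $v \cdot w > 0$,
regardless of whether $p \le q$. □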



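Theorem 1. The set $\mathcal{R}$ as defined in (6) is relevant to $A_n^*$.

Proof. We prove the theorem statement by working directly from the definition
of a relevant set of lattice vectors in Definition 4. Consider some $x \in \mathbb{R}^{n+1}$
which satisfies (5) for every $r \in \mathcal{R}$. Consider also some $v \in A_n^*$. From Lemma 1,
we can write

$$v = \sum_{j=1}^{n} c_j w_j$$

where each $c_j \ge 0$, $j = 1, 2, \ldots, n$ and

$$w_j = \sum_{i=1}^{j} b_{s_i}$$

where $s \in P^{n+1}$. Thus, $w_j \in \mathcal{R}$.

Now, our assumption that (5) is satisfied implies that

$$x \cdot v = \sum_{j=1}^{n} c_j \, x \cdot w_j \le \frac{1}{2} \sum_{j=1}^{n} c_j \|w_j\|^2. \qquad (8)$$

Furthermore,

$$\|v\|^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} c_i c_j \, w_i \cdot w_j
= \sum_{i=1}^{n} c_i^2 \|w_i\|^2 + 2 \sum_{i=2}^{n} \sum_{j=1}^{i-1} c_i c_j \, w_i \cdot w_j
\ge \sum_{i=1}^{n} c_i^2 \|w_i\|^2 \qquad (9)$$

because $w_i \cdot w_j > 0$ from Lemma 2. From (8) and (9) and bearing in mind that
each $c_j$ is a non-negative integer, so that $c_j^2 \ge c_j$, we have

$$x \cdot v \le \tfrac{1}{2} \|v\|^2,$$

as required. □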



4 Degrees of Proximity of a Lattice Point 

In order to explain the workings of subsequent algorithms, we find it convenient
to define degrees of proximity of a lattice point in $A_n^*$ with respect to another
point in $\mathbb{R}^{n+1}$. Firstly, we note the following fact.

Theorem 2. Consider some $x \in \mathbb{R}^{n+1}$. If $v \in A_n^*$ is a closest point to Bx with
respect to the Euclidean norm, where B is defined in (1), then it is also a closest
point to x.

Proof. The proof follows from a simple decomposition of the vector x into
orthogonal components. Consider a lattice point $w \in A_n^*$. We have

$$\|x - w\|^2 = \|(x - Bx) + (Bx - w)\|^2.$$

Now, w can be expressed as w = Bz, where $z \in \mathbb{Z}^{n+1}$. Thus,

$$\|x - w\|^2 = \|(x - Bx) + B(x - z)\|^2.$$

Consider the inner product $(x - Bx) \cdot (Bx - w)$. We have

$$(x - Bx) \cdot (Bx - w) = x^T B(x - z) - x^T B^T B(x - z) = 0,$$

since $B^T B = B$. Clearly, then,

$$\|x - w\|^2 = \|x - Bx\|^2 + \|Bx - w\|^2 \ge \|x - Bx\|^2 + \|Bx - v\|^2 = \|x - v\|^2,$$

as required. □

From this theorem, we see that it is sufficient to consider only the nearest
lattice points to points on the plane $B\mathbb{R}^{n+1}$. We now define three degrees of
proximity to a lattice point in $A_n^*$ with respect to points in $B\mathbb{R}^{n+1}$.

Definition 5. Consider a lattice point $v \in A_n^*$ and a point $y \in B\mathbb{R}^{n+1}$. Let
$\delta = y - v$. The lattice point v is α-close to y if

$$|\delta_i - \delta_j| \le 1 \qquad (10)$$

for all $i, j = 1, 2, \ldots, n + 1$. The lattice point v is β-close to y if

$$|\delta_i| \le \tfrac{1}{2} \qquad (11)$$

for all $i = 1, 2, \ldots, n + 1$ and it is γ-close if

$$\left| \sum_{i=1}^{m} \delta_{s_i} \right| \le \frac{m(n + 1 - m)}{2(n + 1)} \qquad (12)$$

for all $m = 1, 2, \ldots, n$ and $s \in P^{n+1}$.


Theorem 3. If $v \in A_n^*$ is γ-close to a point $y \in B\mathbb{R}^{n+1}$, then v is a nearest
lattice point to y.

Proof. From (3), we find that, with $\delta = y - v$,

$$\sum_{i=1}^{m} \delta_{s_i} = \delta \cdot w = -\delta \cdot w'$$

where $s \in P^{n+1}$,

$$w = \sum_{i=1}^{m} b_{s_i} \quad \text{and} \quad w' = -w = \sum_{i=m+1}^{n+1} b_{s_i}.$$

Clearly, both w and w' are elements of $\mathcal{R}$ as defined in (6). Furthermore, it is
easily confirmed that

$$\|w\|^2 = \|w'\|^2 = \frac{m(n + 1 - m)}{n + 1}.$$

Therefore, satisfaction of the inequality (12) for any particular value of s is
equivalent to simultaneous satisfaction of

$$\delta \cdot w \le \tfrac{1}{2} \|w\|^2 \quad \text{and} \quad \delta \cdot w' \le \tfrac{1}{2} \|w'\|^2 \qquad (13)$$

for the corresponding vector w.

Satisfaction of (12) for all values of $s \in P^{n+1}$ is then equivalent to
satisfaction of (13) for all $w \in \mathcal{R}$. However, from Theorem 1, we know that $\mathcal{R}$ is
relevant to $A_n^*$ and so the origin is the closest lattice point to δ. By extension,
v is the closest lattice point to y when v is γ-close to y. □



We now describe three algorithms. The first algorithm takes as input a point
$x \in \mathbb{R}^{n+1}$ and, in O(n) arithmetic steps, outputs a point $z \in \mathbb{Z}^{n+1}$ such
that Bz is α-close to Bx. The second and third algorithms both
take a point $x \in \mathbb{R}^{n+1}$ and a point $z \in \mathbb{Z}^{n+1}$ as inputs and, after $O(n \log n)$
operations, output a new value of z. For the second algorithm, the inputs z and
x are assumed to be such that Bz and Bx are α-close and, on output, they
are β-close. For the third algorithm, the inputs z and x are assumed to be such
that Bz and Bx are β-close and, on output, they are γ-close. From Theorem 3,
we see that the application of these algorithms in series results in an algorithm
which finds a nearest lattice point in $A_n^*$ to an input point in $O(n \log n)$ steps.



5 The First Algorithm 

We will now set out to prove that the algorithm described below, which takes as 
its input some x G outputs a vector 2 G such that the lattice point 

Bz is a-close to Bx. In the following, we will use the notation [•] to denote a 
function which returns a nearest integer to its real argument. 

Algorithm 1.

1 begin
2   for i := 1 to n + 1 do
3     $z_i := \lfloor x_i \rceil$;
4   od;
5   output(z);
6 end.



Proposition 1. If $x \in \mathbb{R}^{n+1}$ is input to Algorithm 1 then a point $z \in \mathbb{Z}^{n+1}$ is
output such that $Bz$ is $\alpha$-close to $Bx$.

Proof. If we write $y = Bx$ and $v = Bz$ then we have

$$y_j - v_j = (x_j - \lfloor x_j \rceil) - \frac{1}{n+1} \sum_{i=1}^{n+1} (x_i - \lfloor x_i \rceil).$$

Now, $-\tfrac{1}{2} \le x_i - \lfloor x_i \rceil \le \tfrac{1}{2}$ so we conclude that $|y_j - v_j| \le 1$. Furthermore,

$$(y_j - v_j) - (y_k - v_k) = (x_j - \lfloor x_j \rceil) - (x_k - \lfloor x_k \rceil)$$

and so $|(y_j - v_j) - (y_k - v_k)| \le 1$. Thus, $v$ is $\alpha$-close to $y$, as required. □
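As a hedged illustration of Algorithm 1 and Proposition 1, the following Python sketch rounds each coordinate and then checks the two inequalities established in the proof above. The function names, the round-half-up tie rule and the tolerance `tol` are assumptions made for this example; they do not appear in the original text.

```python
import math

def project(u):
    """Return Bu, i.e. subtract the mean of u from every element."""
    mean = sum(u) / len(u)
    return [ui - mean for ui in u]

def algorithm1(x):
    """Round every coordinate of x to a nearest integer (ties rounded up)."""
    return [math.floor(xi + 0.5) for xi in x]

def is_alpha_close(x, z, tol=1e-12):
    """Check |d_j| <= 1 and |d_j - d_k| <= 1 for d = Bx - Bz."""
    d = [yi - vi for yi, vi in zip(project(x), project(z))]
    return max(abs(di) for di in d) <= 1 + tol and max(d) - min(d) <= 1 + tol

x = [0.4, -1.7, 2.5, 3.1]
z = algorithm1(x)
print(z, is_alpha_close(x, z))
```

The check is only a numerical sanity test on one input; the guarantee for all inputs is the content of Proposition 1.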

6 The Second Algorithm 

In this section, we will show that Algorithm 2, listed below, given inputs
$x \in \mathbb{R}^{n+1}$ and $z \in \mathbb{Z}^{n+1}$ such that $Bz$ is $\alpha$-close to $Bx$, outputs a new value for
$z$, say $z'$, such that $Bz'$ is $\beta$-close to $Bx$. Before setting out the algorithm, we
define the use of two functions, project and sortindices. The function project
takes an input, say, $u \in \mathbb{R}^{n+1}$ and returns $Bu$. Note that this function can be
calculated in $O(n)$ steps. We see this by observing that the $i$-th element of $Bu$ is
$u_i - \mu$, where $\mu$ is the average of the elements of $u$. The function sortindices takes
as input a vector, say, $u \in \mathbb{R}^{n+1}$ and returns an element $s$ of $\sigma(u)$, which is to
say that it returns a vector $s$ such that $u_{s_1} \le u_{s_2} \le \cdots \le u_{s_{n+1}}$. This function
requires $O(n \log n)$ arithmetic operations [6].



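Algorithm 2.

1 begin
2   y := project(x);
3   v := project(z);
4   $\delta$ := y - v;
5   s := sortindices($\delta$);
6   m := 0;
7   while $\delta_{s_{m+1}} < \frac{m}{n+1} - \frac{1}{2}$ do
8     m := m + 1;
9     $z_{s_m} := z_{s_m} - 1$;
10  od;
11  m := n + 1;
12  while $\delta_{s_m} > \frac{m}{n+1} - \frac{1}{2}$ do
13    $z_{s_m} := z_{s_m} + 1$;
14    m := m - 1;
15  od;
16  output(z);
17 end.

Lemma 3. Let $v$ be a lattice point of $A_n^*$ and let $y$ be a point in $B\mathbb{R}^{n+1}$. If $v$
is $\alpha$-close to $y$ then, with $\delta = y - v$, there exists a permutation $s \in \mathcal{P}^{n+1}$ and
an integer $0 \le m \le n+1$ such that

$$\eta_m - 1 \le \delta_{s_1} \le \cdots \le \delta_{s_m} \le \eta_m \le \delta_{s_{m+1}} \le \cdots \le \delta_{s_{n+1}} \le \eta_m + 1 \qquad (14)$$

where

$$\eta_m = \frac{m}{n+1} - \frac{1}{2}. \qquad (15)$$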



Proof. We will prove the lemma by contradiction. We begin by choosing a value
of $s \in \sigma(\delta)$, for if we did not, we could not possibly satisfy (14).

However, suppose there is no value of $m$ which will satisfy (14), even with
$s \in \sigma(\delta)$. With the value $m = 0$ or $m = n+1$, this implies that there is some
index $i$ such that $|\delta_i| > \tfrac{1}{2}$. Note that, because $v$ is $\alpha$-close to $y$, there does not
exist a pair of indices $i$ and $j$ such that

$$\delta_i < -\tfrac{1}{2} \quad\text{and}\quad \delta_j > \tfrac{1}{2}$$

because this would contradict (10) of Definition 5. So assume, without loss of
generality, that there exists some index $i$ such that $\delta_i < -\tfrac{1}{2}$. This implies that
$\delta_{s_1} < -\tfrac{1}{2}$ and that $\delta_{s_{n+1}} \le \tfrac{1}{2}$.

Now, we will show that, even though (14) is not satisfied for $m = 0$ or
$m = n+1$, there must exist some value of $m$ with $1 \le m \le n$ such that the
slightly weaker condition

$$\delta_{s_1} \le \delta_{s_2} \le \cdots \le \delta_{s_m} \le \eta_m \le \delta_{s_{m+1}} \le \delta_{s_{m+2}} \le \cdots \le \delta_{s_{n+1}} \qquad (16)$$

is satisfied. If this were not the case then, by applying an inductive argument from
our assumption that $\delta_{s_1} < \eta_0$, we would have $\delta_{s_i} < \eta_{i-1}$ for each $i = 1, 2, \ldots, n+1$.
However, this would imply that

$$\sum_{i=1}^{n+1} \delta_{s_i} < \sum_{i=1}^{n+1} \eta_{i-1} = -\frac{1}{2}$$

whereas this sum should be zero since $\delta \in B\mathbb{R}^{n+1}$.

Therefore, although we assume that (14) is not satisfied for any value of $m$
and, as a result, we further assume that $\delta_{s_1} < -\tfrac{1}{2}$, we have concluded that at
least (16) must hold for some $m$ with $1 \le m \le n$. However, this implies that
$\delta_{s_1} \le \eta_m \le \delta_{s_{n+1}}$ and, because $v$ is $\alpha$-close to $y$, we have $|\delta_{s_{n+1}} - \delta_{s_1}| \le 1$ so
that $\eta_m - 1 \le \delta_{s_1} \le \eta_m \le \delta_{s_{n+1}} \le \eta_m + 1$. Finally, this implies (14), contradicting
our initial assumption. □



Theorem 4. Let $v$ be a lattice point of $A_n^*$, $y$ be a point in $B\mathbb{R}^{n+1}$ and let
$\delta = y - v$. If $v$ is $\alpha$-close to $y$ then there exists some $s \in \mathcal{P}^{n+1}$ and $0 \le m \le n+1$
such that (14) and (15) are satisfied and, moreover, $v - w$ is $\beta$-close to $y$, where

$$w = \sum_{i=1}^{m} b_{s_i}. \qquad (17)$$

Proof. It can be readily checked that

$$w_{s_i} = \begin{cases} 1 - \dfrac{m}{n+1} & \text{if } i \le m, \\[4pt] -\dfrac{m}{n+1} & \text{otherwise.} \end{cases} \qquad (18)$$

With reference to the inequalities of (14), we then find that the values of the
elements of the difference $\epsilon = y - v + w = \delta + w$ satisfy the inequality $|\epsilon_i| \le \tfrac{1}{2}$
for all $i = 1, 2, \ldots, n+1$. Therefore, $v - w$ is $\beta$-close to $y$. □



Proposition 2. If $x \in \mathbb{R}^{n+1}$ and $z \in \mathbb{Z}^{n+1}$ are input to Algorithm 2 and $Bz$
is $\alpha$-close to $Bx$ then, after $O(n \log n)$ arithmetic operations, a new value of $z$
is output such that $Bz$ is $\beta$-close to $Bx$.

Proof. When line 6 of Algorithm 2 is reached, we have computed the values $y = Bx$,
$v = Bz$, $\delta = y - v$ and a permutation $s \in \sigma(\delta)$. As discussed previously,
the execution time to this point is dominated by the sorting operation, which
requires $O(n \log n)$ arithmetic operations to complete.

First of all, suppose $|\delta_i| \le \tfrac{1}{2}$ for all $i = 1, 2, \ldots, n+1$. That is, we consider
the case where $v$ is already $\beta$-close to $y$. Clearly, neither of the while loops
on lines 7-10 or 12-15 will be entered and the value of $z$ will be unchanged.
Therefore, in this case, the output $z$ is identical to the input $z$ and so the
output $Bz$ is $\beta$-close to $Bx$, as required.

Suppose there exists some index $i$ such that $|\delta_i| > \tfrac{1}{2}$. That is, either $\delta_{s_1} < -\tfrac{1}{2}$
or $\delta_{s_{n+1}} > \tfrac{1}{2}$. These inequalities cannot both be satisfied at once because $v$ is
$\alpha$-close to $y$. This implies that exactly one of the while loops on lines 7-10
and 12-15 will be entered.

Suppose the while loop which is entered is the first one, on lines 7-10. This
is equivalent to the supposition that $\delta_{s_1} < -\tfrac{1}{2}$ and $\delta_{s_{n+1}} \le \tfrac{1}{2}$. The while loop
continues while $\delta_{s_{m+1}} < \eta_m$, incrementing $m$ on each pass. Lemma 3
guarantees us that the loop will terminate with $1 \le m \le n$, since (14) is not
satisfied for $m = 0$ or $m = n+1$. Thus, the loop requires $O(n)$ steps to complete.
When it terminates, (16) will be satisfied with $1 \le m \le n$. As discussed in the
proof of Lemma 3, the fact that $\delta_{s_1} \le \eta_m \le \delta_{s_{n+1}}$ implies that (14) is satisfied
for that value of $m$. Furthermore, by the end of the loop, the new value of $z$
(let us denote this new value as $z'$) differs from the input value by $-1$ in
a given element whenever its index is in the set $\{s_1, s_2, \ldots, s_m\}$. That is,

$$Bz' = Bz - \sum_{i=1}^{m} b_{s_i}. \qquad (19)$$

From Theorem 4, $Bz'$ is then $\beta$-close to $Bx$.

Finally, suppose instead that it is the second while loop on lines 12-15
that is entered, which implies that $\delta_{s_{n+1}} > \tfrac{1}{2}$ and $\delta_{s_1} \ge -\tfrac{1}{2}$. The argument
follows along similar lines to those employed when the first while loop is entered.
We are assured that the loop will terminate with $1 \le m \le n$, thus requiring
$O(n)$ operations to complete. In turn, this implies that (16) will be satisfied
with $1 \le m \le n$ and that, as a result, (14) will also be satisfied. When the
loop terminates, the new value of $z$ which has been computed, say $z'$, differs
from the input value by 1 in a given element whenever its index is in the set
$\{s_{m+1}, s_{m+2}, \ldots, s_{n+1}\}$. That is,

$$Bz' = Bz + \sum_{i=m+1}^{n+1} b_{s_i}.$$

However, substituting the identity

$$\sum_{i=m+1}^{n+1} b_{s_i} = -\sum_{i=1}^{m} b_{s_i}$$

we find we again have (19) and so, from Theorem 4, $Bz'$ is $\beta$-close to $Bx$. □
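The two loops analysed in the proof of Proposition 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the loop tests are taken from the proof (the first loop runs while $\delta_{s_{m+1}} < \eta_m$, the second while $\delta_{s_m} > \eta_m$, with $\eta_m = m/(n+1) - 1/2$), indices are 0-based, and the extra bounds checks on `m` are defensive additions that the original pseudocode does not need when its input is $\alpha$-close.

```python
def project(u):
    """Return Bu: subtract the mean of u from every element."""
    mean = sum(u) / len(u)
    return [ui - mean for ui in u]

def sortindices(u):
    """Return indices s with u[s[0]] <= u[s[1]] <= ... (an element of sigma(u))."""
    return sorted(range(len(u)), key=lambda i: u[i])

def algorithm2(x, z):
    """Given Bz alpha-close to Bx, return z' with Bz' beta-close to Bx."""
    n1 = len(x)                                   # n + 1
    delta = [yi - vi for yi, vi in zip(project(x), project(z))]
    s = sortindices(delta)
    z = list(z)
    eta = lambda m: m / n1 - 0.5
    m = 0
    while m < n1 and delta[s[m]] < eta(m):        # delta_{s_{m+1}} < eta_m
        z[s[m]] -= 1
        m += 1
    m = n1
    while m > 0 and delta[s[m - 1]] > eta(m):     # delta_{s_m} > eta_m
        z[s[m - 1]] += 1
        m -= 1
    return z

x = [0.4, 0.4, 0.4, -0.4]
z_new = algorithm2(x, [0, 0, 0, 0])
residual = [yi - vi for yi, vi in zip(project(x), project(z_new))]
print(z_new, max(abs(r) for r in residual))
```

Note that `delta` is computed once and never updated inside the loops, mirroring the pseudocode; only `z` changes.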

7 The Third Algorithm 

Having presented algorithms which compute an $\alpha$-close lattice point in $A_n^*$ to an
input point and, given an $\alpha$-close point, produce one which is $\beta$-close,
we now present an algorithm that can produce a $\gamma$-close point from one which
is $\beta$-close.

Algorithm 3.

1 begin
2   y := project(x);
3   v := project(z);
4   $\delta$ := y - v;
5   s := sortindices($\delta$);
6   m := 0; t := 0; $\tau$ := 0;
7   for i := 1 to n do
8     $t := t + \delta_{s_i} - \frac{2(i-1) - n}{2(n+1)}$;
9     if $t < \tau$ then $\tau := t$; m := i; fi
10  od;
11  for i := 1 to m do
12    $z_{s_i} := z_{s_i} - 1$;
13  od;
14  output(z);
15 end.



Lemma 4. A point $v \in A_n^*$ is $\gamma$-close (a nearest lattice point) to $y \in B\mathbb{R}^{n+1}$
if and only if

$$\sum_{i=1}^{m} (\delta_{s_i} - p_i) \ge 0 \qquad (20)$$

for all $m = 1, 2, \ldots, n+1$, where $\delta = y - v$, $s \in \sigma(\delta)$ and

$$p = \frac{1}{2(n+1)} (-n, -n+2, \ldots, n-2, n). \qquad (21)$$

Proof. First of all, let us prove that (20) is a necessary condition for $v$ to be
$\gamma$-close to $y$. Noting that

$$\sum_{i=1}^{m} p_i = -\frac{m(n+1-m)}{2(n+1)}$$

we see that if (12) of Definition 5 is satisfied then (20) must be satisfied also.

Having proved necessity, let us now prove sufficiency. For any $r \in \mathcal{P}^{n+1}$ and
$1 \le m \le n+1$, we use the fact that the sum of any $m$ elements of $\delta$ must be
greater than or equal to the sum of the $m$ smallest elements to obtain

$$\sum_{i=1}^{m} \delta_{r_i} \ge \sum_{i=1}^{m} \delta_{s_i} \ge \sum_{i=1}^{m} p_i = -\frac{m(n+1-m)}{2(n+1)}.$$

Similarly, using the fact that the sum of any $m$ elements of $\delta$ must be less than
or equal to the sum of the $m$ largest elements, we find that

$$\sum_{i=1}^{m} \delta_{r_i} \le \sum_{i=n+2-m}^{n+1} \delta_{s_i} = -\sum_{i=1}^{n+1-m} \delta_{s_i} \le -\sum_{i=1}^{n+1-m} p_i = \frac{m(n+1-m)}{2(n+1)}.$$

Thus, we find that if (20) is satisfied then, for any $r \in \mathcal{P}^{n+1}$ and $1 \le m \le n$,

$$\left| \sum_{i=1}^{m} \delta_{r_i} \right| \le \frac{m(n+1-m)}{2(n+1)}$$

and, since this is simply (12) of Definition 5, $v$ is $\gamma$-close to $y$. □
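Lemma 4 reduces the nearest-point test to $n+1$ partial sums of the sorted residual. A minimal Python sketch of that test is given below; the function names and the tolerance `tol` are assumptions made for illustration, and $p_i = (2(i-1) - n)/(2(n+1))$ is read off from (21).

```python
def project(u):
    """Return Bu: subtract the mean of u from every element."""
    mean = sum(u) / len(u)
    return [ui - mean for ui in u]

def is_gamma_close(x, z, tol=1e-12):
    """Test condition (20): all partial sums of sorted(delta) - p are non-negative."""
    n1 = len(x)
    n = n1 - 1
    delta = sorted(yi - vi for yi, vi in zip(project(x), project(z)))
    total = 0.0
    for i in range(1, n1 + 1):
        p_i = (2 * (i - 1) - n) / (2 * (n + 1))
        total += delta[i - 1] - p_i
        if total < -tol:
            return False
    return True

x = [-0.45, 0.05, 0.40]
print(is_gamma_close(x, [0, 0, 0]), is_gamma_close(x, [-1, 0, 0]))
```

The sort dominates the cost, so the test runs in $O(n \log n)$ operations, in line with the complexity quoted for the algorithms in this paper.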




Lemma 5. Consider a lattice point $v \in A_n^*$ and a point $y \in B\mathbb{R}^{n+1}$. Let
$\delta = y - v$ and $s \in \sigma(\delta)$. Suppose $v$ is $\beta$-close to $y$. If $v' = v - w$ where

$$w = \sum_{i=1}^{m} b_{s_i} \qquad (22)$$

for some $1 \le m \le n$ then, with $\delta' = y - v'$, there exists some $s' \in \sigma(\delta')$ for
which

$$(s'_1, s'_2, \ldots, s'_{n+1}) = (s_{m+1}, s_{m+2}, \ldots, s_{n+1}, s_1, s_2, \ldots, s_m). \qquad (23)$$

That is, there exists some $s' \in \sigma(\delta')$ which is the elementwise left rotation of $s$
by $m$ positions.

Proof. Since the form of $w$ in (22) is identical to that in (17), it follows from (18)
that $\delta'_{s_i} \le \delta'_{s_j}$ whenever $1 \le i \le j \le m$ and whenever $m+1 \le i \le j \le n+1$.
Furthermore, since $v$ is $\beta$-close to $y$, we know that $\delta_{s_1} \ge -\tfrac{1}{2}$ and $\delta_{s_{n+1}} \le \tfrac{1}{2}$.
Hence,

$$\delta'_{s_1} = \delta_{s_1} + 1 - \frac{m}{n+1} \ge \frac{1}{2} - \frac{m}{n+1}$$

and

$$\delta'_{s_{n+1}} = \delta_{s_{n+1}} - \frac{m}{n+1} \le \frac{1}{2} - \frac{m}{n+1}.$$

This implies that

$$\delta'_{s_{m+1}} \le \delta'_{s_{m+2}} \le \cdots \le \delta'_{s_{n+1}} \le \delta'_{s_1} \le \delta'_{s_2} \le \cdots \le \delta'_{s_m}$$

which in turn implies (23), as required. □



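Corollary 1. Consider a lattice point $v \in A_n^*$ and a point $y \in B\mathbb{R}^{n+1}$. Let
$\delta = y - v$, $s \in \sigma(\delta)$ and

$$\epsilon = (\delta_{s_1} - p_1, \delta_{s_2} - p_2, \ldots, \delta_{s_{n+1}} - p_{n+1}) \qquad (24)$$

where $p$ is defined in (21). Suppose $v$ is $\beta$-close to $y$. If $v' = v - w$ where $w$
is given in (22) then, with $\delta' = y - v'$, there exists some $s' \in \sigma(\delta')$ such that,
when

$$\epsilon' = (\delta'_{s'_1} - p_1, \delta'_{s'_2} - p_2, \ldots, \delta'_{s'_{n+1}} - p_{n+1})$$

$\epsilon'$ is the elementwise left rotation of $\epsilon$ by $m$ positions.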
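Proof. From Lemma 5, we have, when $1 \le i \le n+1-m$,

$$\delta'_{s'_i} = \delta'_{s_{i+m}} = \delta_{s_{i+m}} - \frac{m}{n+1}$$

and so

$$\epsilon'_i = \delta'_{s'_i} - p_i = \delta_{s_{i+m}} - \frac{2(m+i-1) - n}{2(n+1)} = \delta_{s_{i+m}} - p_{i+m} = \epsilon_{i+m}.$$

Similarly, when $n+1-m < i \le n+1$, we have

$$\delta'_{s'_i} = \delta'_{s_{i+m-n-1}} = \delta_{s_{i+m-n-1}} + 1 - \frac{m}{n+1}$$

which implies that

$$\epsilon'_i = \delta'_{s'_i} - p_i = \delta_{s_{i+m-n-1}} - \frac{2(i+m-n-2) - n}{2(n+1)} = \delta_{s_{i+m-n-1}} - p_{i+m-n-1} = \epsilon_{i+m-n-1}$$

and hence $\epsilon'$ is the elementwise left rotation of $\epsilon$ by $m$ positions.

Lemma 6. Consider a point $\epsilon \in B\mathbb{R}^{n+1}$. If

$$m = \arg\min_{1 \le j \le n+1} \sum_{i=1}^{j} \epsilon_i \qquad (25)$$

and $\epsilon'$ is the elementwise left rotation of $\epsilon$ by $m$ positions then

$$\sum_{i=1}^{j} \epsilon'_i \ge 0 \qquad (26)$$

for all $1 \le j \le n+1$.

Proof. If $1 \le j \le n+1-m$ then

$$\sum_{i=1}^{j} \epsilon'_i = \sum_{i=m+1}^{m+j} \epsilon_i = \left( \sum_{i=1}^{m+j} \epsilon_i \right) - \left( \sum_{i=1}^{m} \epsilon_i \right) \ge 0.$$

On the other hand, if $n+1-m < j \le n+1$ then we find that

$$\sum_{i=1}^{j} \epsilon'_i = \left( \sum_{i=m+1}^{n+1} \epsilon_i \right) + \left( \sum_{i=1}^{j+m-n-1} \epsilon_i \right) = \left( \sum_{i=1}^{j+m-n-1} \epsilon_i \right) - \left( \sum_{i=1}^{m} \epsilon_i \right) \ge 0$$

and so (26) is true for any $1 \le j \le n+1$.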
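Lemma 6 is the familiar fact that rotating a zero-sum sequence to start just after its minimum prefix sum makes every prefix sum non-negative. The short Python sketch below illustrates it on integer data chosen for this example (the names `best_rotation` and `rotate_left` are hypothetical, not from the original text).

```python
from itertools import accumulate

def best_rotation(eps):
    """Return m in 1..n+1 minimising eps_1 + ... + eps_m, as in (25)."""
    prefix = list(accumulate(eps))
    return min(range(1, len(eps) + 1), key=lambda j: prefix[j - 1])

def rotate_left(eps, m):
    """Elementwise left rotation of eps by m positions."""
    return eps[m:] + eps[:m]

eps = [3, -5, -1, 4, -1]            # sums to zero, like a point of B R^{n+1}
m = best_rotation(eps)
rotated = rotate_left(eps, m)
print(m, rotated, list(accumulate(rotated)))
```

Integers are used here only to avoid rounding noise in the demonstration; the lemma itself is stated for real vectors.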



Theorem 5. Consider a lattice point $v \in A_n^*$ and a point $y \in B\mathbb{R}^{n+1}$. Let
$\delta = y - v$, $s \in \sigma(\delta)$ and let $\epsilon$ be as defined in (24). Suppose $v$ is $\beta$-close to $y$.
If $m$ is defined according to (25) then $v' = v - w$ is $\gamma$-close to $y$, where $w$ is
defined according to (22).

Proof. The proof follows directly from application of Lemma 6, Corollary 1 and 
Lemma 4. □ 
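The construction of Theorem 5, and the series composition of the three algorithms, can be sketched in Python as below. This is an illustrative sketch only: the running sum in `algorithm3` accumulates $\delta_{s_i} - p_i$ with $p_i$ taken from (21), the while-loop tests in `algorithm2` are those described in the proof of Proposition 2, and the helper names `residual` and `nearest_point` are assumptions introduced here.

```python
import math

def project(u):
    mean = sum(u) / len(u)
    return [ui - mean for ui in u]

def sortindices(u):
    return sorted(range(len(u)), key=lambda i: u[i])

def residual(x, z):
    """delta = Bx - Bz."""
    return [yi - vi for yi, vi in zip(project(x), project(z))]

def algorithm1(x):
    return [math.floor(xi + 0.5) for xi in x]

def algorithm2(x, z):
    n1 = len(x)
    delta = residual(x, z)
    s = sortindices(delta)
    z = list(z)
    m = 0
    while m < n1 and delta[s[m]] < m / n1 - 0.5:
        z[s[m]] -= 1
        m += 1
    m = n1
    while m > 0 and delta[s[m - 1]] > m / n1 - 0.5:
        z[s[m - 1]] += 1
        m -= 1
    return z

def algorithm3(x, z):
    n = len(x) - 1
    delta = residual(x, z)
    s = sortindices(delta)
    m, t, tau = 0, 0.0, 0.0
    for i in range(1, n + 1):
        t += delta[s[i - 1]] - (2 * (i - 1) - n) / (2 * (n + 1))
        if t < tau:                 # new minimum prefix sum, cf. (25)
            tau, m = t, i
    z = list(z)
    for i in range(m):              # subtract b_{s_1}, ..., b_{s_m}, cf. (22)
        z[s[i]] -= 1
    return z

def nearest_point(x):
    """Return (z, Bz) with Bz a nearest point of A_n^* to Bx."""
    z = algorithm3(x, algorithm2(x, algorithm1(x)))
    return z, project(z)

print(nearest_point([-0.45, 0.05, 0.40]))
```

Each stage recomputes the projection and the sort for clarity; an implementation concerned with constant factors could share that work between stages.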



Proposition 3. If $x \in \mathbb{R}^{n+1}$ and $z \in \mathbb{Z}^{n+1}$ are input to Algorithm 3 and $Bz$
is $\beta$-close to $Bx$ then, after $O(n \log n)$ arithmetic operations, a new value of $z$
is output such that $Bz$ is $\gamma$-close to $Bx$.

Proof. The proposition is proved merely by observing that Algorithm 3 expresses
in a programmatic way the construction of a $\gamma$-close point given in Theorem 5.
Specifically, the first for loop on lines 7-10 calculates the value of $m$ as defined
in (25), albeit that the value of $m$ is restricted to lie in the range $0 \le m \le n$
rather than $1 \le m \le n+1$, a variation that has no mathematical implications.
The second for loop on lines 11-13 constructs the new value of $z$, say $z'$, such
that $v' = Bz' = v - w$, where $w$ is defined as in (22).

In terms of the amount of calculation required, both for loops require only
$O(n)$ arithmetic steps to complete and so the total execution time is dominated
by the execution of the sortindices procedure, which requires $O(n \log n)$
operations. □



Abstract. Binary and quaternary sequences with perfect periodic
autocorrelation, and perfect nonlinear $p^m$-ary sequences, are both shown
to equate to orthogonal coboundaries, the simplest class of orthogonal
cocycles. We consider doubly-indexed sequences defined by cocycles. We
give a new construction (a generalised multiplication) of orthogonal
cocycles and show it gives perfect nonlinear sequences for parameters
where 1-dimensional PN sequences cannot exist.



1 Introduction 

Sequences with desirable correlation or distribution properties are much sought
for use in signal transmission, optical imaging and encryption. Much effort has
been expended on constructing and classifying sequences which are optimal or
nearly optimal with respect to some measure of merit. The sequence is frequently
regarded as a mapping from an index set $G$ (such as the modular group $\mathbb{Z}_n$, or
the finite field $\mathbb{F}_{p^m}$) to a sequence set $C$ (typically taking binary, complex or
$p^m$-ary values). The measure of merit can be off-peak correlation, or uniform
distribution amongst sequence values, or nonlinearity.

Here we observe that sequences with perfect periodic autocorrelation and per- 
fect nonlinear sequences both equate to orthogonal coboundaries — the simplest 
class of a set of functions called orthogonal cocycles which are 2-dimensional on 
G. This allows us to generalise from 1-dimensional to 2-dimensional mappings 
and hence consider sequences defined by cocycles, which have improved perfor- 
mance over the optimal 1-dimensional functions against these figures of merit. 

Cocycles are mappings $\psi : G \times G \to C$, where $G$ and $C$ are finite groups
with $C$ abelian, which satisfy a particular quasi-associative equation (1). They
arise naturally in the topology of surfaces, in quantum dynamics, in projective 
representation theory, and in combinatorial design theory, as well as in the co- 
homology theory of groups. 

Increasingly, links have been found between the optimal 1-dimensional se- 
quences and difference sets [2,3,15]. In the 2-dimensional case, there is a precise 
link: we know each orthogonal cocycle is equivalent to a semiregular central 
relative difference set, and conversely [17]. 

Semiregular relative difference sets are of interest precisely because of their
good Hamming correlation properties for FHMA. For example, perfect binary
arrays are equivalent to splitting abelian relative $(4u^2, 2, 4u^2, 2u^2)$-difference sets,
and Jedwab's generalised perfect binary arrays are equivalent to abelian relative
$(4t, 2, 4t, 2t)$-difference sets [13, Theorem 3.2]. Kumar's ideal matrices for FHMA
communications systems are the two-dimensional characteristic functions of
relative $(v, v, v, 1)$-difference sets in [16], and hence are rare.

In this paper we firstly give a new construction (Theorem 2) of orthogo- 
nal cocycles. Secondly, we show that orthogonal coboundaries are the same as 
1-dimensional sequences with perfect periodic autocorrelation and with perfect 
nonlinearity. We generalise these measures to 2-dimensional sequences of cocycle 
values and prove that the distribution of values of a cocyclic sequence is invariant 
(Theorem 3) under the relevant equivalence operations on the cocycle. We prove 
that there are many 2-dimensional sequences with perfect nonlinearity (Exam- 
ple 5) for parameters where 1-dimensional PN functions cannot exist, and give 
constructions for Almost-PN and $\Delta$-nonlinear cocyclic sequences (Corollary 3).

Earlier, the author and Perera [10] introduced a very general description of 
cocyclic codes in order to demonstrate the previously unrecognised (and well- 
hidden) presence of cocycles in several code construction techniques. Category I 
of these codes comprised those constructed from a cocyclic generalised Hadamard 
matrix. Many standard constructions of (generalised) Hadamard matrices are in 
fact cocyclic, and in [9] many well-known codes are shown to be Category I. 
Since (generalised) Hadamard matrices determine nonlinear codes which meet 
the Plotkin bound, and orthogonal cocycles are equivalent to cocyclic gener- 
alised Hadamard matrices [17], the new orthogonal cocycles identified here also 
determine optimal nonlinear codes. 

2 Cocycles and Generalised Hadamard Matrices 

Throughout, $G$ will be a finite group of order $v$ and $C$ will be a finite abelian
group of order $w$. A (2-dimensional) cocycle is a mapping $\psi : G \times G \to C$
satisfying the cocycle equation

$$\psi(g,h)\,\psi(gh,k) = \psi(g,hk)\,\psi(h,k), \quad \forall g, h, k \in G. \qquad (1)$$

This implies $\psi(g,1) = \psi(1,h) = \psi(1,1)$, $\forall g, h \in G$, so we follow standard
usage and consider only normalised cocycles, for which $\psi(1,1) = 1$. Each cocycle
$\psi$ determines a central extension $1 \to C \to E_\psi \to G \to 1$ of $C$ by $G$, in which the
group $E_\psi$ of order $vw$ consists of the set of ordered pairs $\{(a,g) : a \in C, g \in G\}$
with multiplication

$$(a,g)(b,h) = (ab\,\psi(g,h), gh), \qquad (2)$$

and the image $C \times \{1\}$ of $C$ lies in the centre of $E_\psi$.

A cocycle is naturally displayed as a $G$-cocyclic matrix; that is, a square
matrix whose rows and columns are indexed by the elements of $G$ under some
fixed ordering, and whose entry in position $(g,h)$ is $\psi(g,h)$. If $\psi$ is symmetric
($\psi(g,h) = \psi(h,g)$ always), $M_\psi$ is a symmetric matrix. We write

$$M_\psi = [\psi(g,h)]_{g,h \in G}. \qquad (3)$$

Definition 1. A cocycle is a coboundary $\partial\phi$ if it is derived from a set mapping
$\phi : G \to C$ having $\phi(1) = 1$ by the formula $\partial\phi(g,h) = \phi(g)^{-1}\phi(h)^{-1}\phi(gh)$. Two
cocycles $\psi$ and $\psi'$ are cohomologous if there exists a coboundary $\partial\phi$ such that
$\psi' = \psi \cdot \partial\phi$.


Example 1. If $G = \langle g : g^v = 1 \rangle \cong \mathbb{Z}_v$ and $a$ is an element of order $n$ in $C$,
then $\psi(g^i, g^j) = a^{ij}$, for all $0 \le i, j \le v-1$, is a symmetric cocycle and $M_\psi$
is a Vandermonde matrix. If $n = v$, $M_\psi$ is the matrix of the Discrete Fourier
Transform (or the Mattson-Solomon polynomial).



Example 2. If $G = \mathbb{Z}_2^n$ and $C = \{\pm 1\} \cong \mathbb{Z}_2$ then $\psi(u,v) = (-1)^{u \cdot v}$, for all
$u, v \in G$, is a symmetric cocycle and $M_\psi$ is the Sylvester Hadamard matrix of
order $2^n$.



Example 3. If $V$ is a finite-dimensional vector space over a field $F$ and $\psi$ is a
bilinear form on $V$ then $\psi : (V,+) \times (V,+) \to F$ is a cocycle.

We will term an additively-written abelian group "distributive" if it also
carries a distributive multiplication. That is, a distributive group is an abelian
group $G = (G, +)$ (with identity 0) having a second binary operation $\cdot$ such that
$\forall g, h, k \in G$, $g \cdot (h + k) = g \cdot h + g \cdot k$ and $(h + k) \cdot g = h \cdot g + k \cdot g$. (We write
$g \cdot h = gh$ when the context is clear.)

Such distributive groups include finite rings and presemifields (where $(G, \cdot)$
is a quasigroup; that is, for any nonzero $g, h \in G$ there are unique solutions
in $G$ to $gx = h$ and to $yg = h$.) In particular, Galois fields, Galois rings and
semifields (such as Dickson commutative semifields) are distributive groups. If
a distributive group is a finite commutative semifield but not a field, the only
field property which does not hold is associativity of multiplication, and (see [1,
VI.8.4], [14, p.269]), $G$ is an elementary abelian group of order $p^a \ge 16$, with
$a \ge 3$.

Example 4. Let $G$ be a distributive group. The mapping $\mu : G \times G \to G$ defined
by $\mu(g,h) = gh$, $\forall g, h \in G$, is a normalised cocycle, the multiplication cocycle.
Since $G$ is abelian, $\mu$ is symmetric $\iff$ $E_\mu$ is abelian $\iff$ $(G, \cdot)$ is commutative.

The last four examples are instances of a general class of cocycles: any
mapping $f : G \times G \to C$ which is homomorphic in each coordinate (that is,
$f(gh,k) = f(g,k)f(h,k)$ and $f(g,hk) = f(g,h)f(g,k)$ always) is a normalised
cocycle. In particular we can generalise the multiplication cocycle.

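Example 5. Let $G$ be a distributive group. Let $\lambda, \rho : G \to G$ be mappings, and
define $\mu_{\lambda,\rho} : G \times G \to G$ by $\mu_{\lambda,\rho}(g,h) = \lambda(g)\rho(h)$, $\forall g, h \in G$. If $\lambda$ and $\rho$ are
homomorphisms then $\mu_{\lambda,\rho}$ is a normalised cocycle.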

This link is emphasised by the following partial converse.

Corollary 1. Let $C$ be a distributive group with a multiplicative unity 1, and
let $\lambda, \rho : G \to C$ be mappings. If $\mu_{\lambda,\rho}$ is a normalised cocycle, $1 \in \mathrm{Im}\,\lambda$ (resp.
$\mathrm{Im}\,\rho$) and $\lambda$ (resp. $\rho$) is a homomorphism then $\rho$ (resp. $\lambda$) is a homomorphism.

Proof. Since $\mu_{\lambda,\rho}$ is a cocycle, $\lambda(g)\rho(h) + \lambda(gh)\rho(k) = \lambda(g)\rho(hk) + \lambda(h)\rho(k)$. If
$\lambda$ is a homomorphism, the result follows on setting $\lambda(g) = 1$. □

To link cocycles with generalised Hadamard matrices and perfect sequences 
we will be interested in the frequency with which a cocycle takes each value in 
the target abelian group C. 

Definition 2. Let $\psi : G \times G \to C$ be a cocycle and for each $g \in G$, $a \in C$ define
$N(g,a) = |\{h \in G : \psi(g,h) = a\}|$. The set of frequencies $\{N(g,a) : g \in G, a \in C\}$
is the distribution of $\psi$.
A generalised Hadamard matrix GH$(w, v/w)$ over $W$ is a $v \times v$ matrix $H$ with
entries from the group $W$ of order $w \mid v$ such that the list of quotients $h_{ij}h_{kj}^{-1}$,
$1 \le j \le v$, $i \ne k$, contains each element of $W$ exactly $v/w$ times. With $h^*_{ij} = h_{ji}^{-1}$, the
defining matrix equation over $\mathbb{Z}W$ is

$$HH^* = vI_v + (v/w)\Big(\sum_{u \in W} u\Big)(J_v - I_v). \qquad (4)$$

Any GH$(w, v/w)$ is equivalent to a normalised GH$(w, v/w)$ with its first row and
column consisting entirely of the unit element of $W$. A Hadamard matrix is a
GH$(2, v/2)$. All known examples of GH$(w, v/w)$ require $w$ to be a prime power.

When the GH$(w, v/w)$ $H$ is also cocyclic; that is, $W = C$ is abelian, $w \mid v$
and $H = M_\psi$ for some cocycle $\psi$, this is equivalent to imposing a combinatorial
condition we call orthogonality, on the cocycle.

Definition 3. The cocycle $\psi : G \times G \to C$ is orthogonal when $w \mid v$ if the
non-initial rows of $M_\psi$ are uniformly distributed over the elements of $C$; that is if,
for each $g \ne 1 \in G$,

$$N(g,a) = v/w, \quad \forall a \in C \qquad (5)$$

or equivalently, if in $\mathbb{Z}C$, for each $g \ne 1 \in G$, $\sum_{h \in G} \psi(g,h) = (v/w)\big(\sum_{a \in C} a\big)$.

For instance, in Example 1 if $n \mid v$ then $\psi$ is orthogonal (with entries in $C' = \langle a \rangle$)
when $n = p$ and $v = p^m$, for $p$ a prime. The cocycles of Example 2 are
orthogonal. The multiplication cocycle for any finite presemifield is orthogonal.
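Orthogonality in the sense of Definition 3 can be checked exhaustively for small groups. The Python sketch below tests the cocycle of Example 2 with $n = 3$, the multiplication cocycle of the field $\mathbb{Z}_7$, and, as a contrasting case chosen for this illustration, multiplication in the ring $\mathbb{Z}_6$, which is not a presemifield. The function name `is_orthogonal` and its argument layout are assumptions, not notation from the paper.

```python
from itertools import product

def is_orthogonal(psi, G, C, identity):
    """Definition 3: each row g != identity takes every value of C exactly |G|/|C| times."""
    G, C = list(G), list(C)
    if len(G) % len(C):
        return False
    target = len(G) // len(C)
    return all(
        sum(1 for h in G if psi(g, h) == a) == target
        for g in G if g != identity
        for a in C
    )

# Example 2 with n = 3: G = Z_2^3, C = {+1, -1}, psi(u, v) = (-1)^(u.v).
G2 = list(product((0, 1), repeat=3))
sylvester = lambda u, v: (-1) ** sum(a * b for a, b in zip(u, v))

field_mult = lambda g, h: (g * h) % 7    # multiplication cocycle of the field Z_7
ring_mult = lambda g, h: (g * h) % 6     # ring Z_6 has zero divisors

print(is_orthogonal(sylvester, G2, (1, -1), (0, 0, 0)))
print(is_orthogonal(field_mult, range(7), range(7), 0))
print(is_orthogonal(ring_mult, range(6), range(6), 0))
```

The failure for $\mathbb{Z}_6$ comes from rows indexed by zero divisors, which miss some values of $C$ altogether.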

The designs corresponding to normalised GH$(w, v/w)$ are divisible designs.
When the GH matrix is also cocyclic, these designs have been completely
characterised. Furthermore, they correspond to relative difference sets.

A relative $(v, w, k, \lambda)$-difference set [4] in a finite group $E$ of order $vw$ relative
to a normal subgroup $N$ of order $w$, is a $k$-element subset $R$ of $E$ such that the
multiset of quotients $d_1 d_2^{-1}$ of distinct elements $d_1, d_2$ of $R$ contains each element
of $E \setminus N$ exactly $\lambda$ times, and contains no elements of $N$. (The ordinary $(v, k, \lambda)$-
difference sets correspond to the case $N = \{1\}$.) There is always a short exact
sequence $1 \to N \to E \to E/N \to 1$. We will be concerned only with relative
difference sets having $k = v$ and therefore also $k = w\lambda$. A relative difference set
with the latter property is termed semiregular.

Theorem 1. (Equivalence Theorem) [17, Lemma 2.7, Theorem 4.3] Let $w \mid v$
and let $M_\psi$ be a $G$-cocyclic matrix with entries in $C$. Then
$M_\psi$ is a GH$(w, v/w)$ if and only if
the design $D_\psi$ developed from $\{(1,g), g \in G\} \subset E_\psi$ is a divisible $(v, w, v, v/w)$-
design, class regular with respect to $C$, if and only if
$R_\psi = \{(1,g), g \in G\} \subset E_\psi$ is a relative $(v, w, v, v/w)$-difference set relative
to the central subgroup $C \times \{1\}$, if and only if
the cocycle $\psi$ is orthogonal.

3 New Classes of Orthogonal Cocycles 

In this section we give a new construction of orthogonal cocycles, and show how 
to obtain new orthogonal cocycles from old. 

In [8] generalisations of the multiplication cocycle for a finite field were given, 
together with conditions under which they were orthogonal. 

Example 6. [8, Lemma 3.11] If $F$ is a finite field and $G = (F, +)$, then the power
cocycles $\mu_i : G \times G \to G$ defined by $\mu_i(g,h) = g^{p^i}h$, $0 \le i \le a-1$, are orthogonal.
(Here $\mu_0 = \mu$.)

Udaya [19] has pointed out that the Frobenius automorphism in this example 
can be replaced by an arbitrary linearised permutation polynomial. This prompts 
the following generalisation, giving a very large class of orthogonal cocycles which 
have not previously been recognised. 

Lemma 1. Let $G$ be a distributive group which is a presemifield and let $\lambda, \rho :
G \to G$ be homomorphisms. If $\lambda$ (resp. $\rho$) is an automorphism, then $\mu_{\lambda,\rho} :
G \times G \to G$ is an orthogonal cocycle if and only if $\rho$ (resp. $\lambda$) is an automorphism.

Proof. Here $v = w$, so $v/w = 1$. From Example 5, $\mu_{\lambda,\rho}$ is a cocycle. By symmetry
we may assume $\lambda$ is an automorphism, and we only need to show that $\mu_{\lambda,\rho}$ is
orthogonal if and only if $\rho$ is one-to-one. This is clear because, given any $\lambda(g) \ne 0$
and $k \in G$, there is a unique solution $h \in G$ to the equation $\lambda(g)\rho(h) = k$ if and
only if $|\{h \in G : \mu_{\lambda,\rho}(g,h) = k\}| = 1$. □

The composition of an (orthogonal) cocycle $G \times G \to C$ with an epimorphism
$C \to C'$ of abelian groups is an (orthogonal) cocycle. For example the field
multiplication cocycle projects via the relative trace function.

Example 7. Let $F = \mathbb{F}_{q^m}$ and $K = \mathbb{F}_q$ be finite fields and let $T = \mathrm{Tr}_{F/K}$ be the
relative trace function. Then $T \circ \mu : (F,+)^2 \to (K,+)$ is an orthogonal cocycle.
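The smallest instance of the trace example is $F = \mathbb{F}_4$ over $K = \mathbb{F}_2$, which can be checked by hand or with the short Python sketch below. The sketch encodes elements of $\mathbb{F}_4 = \mathbb{F}_2[\omega]/(\omega^2 + \omega + 1)$ as 2-bit integers with addition given by XOR; this encoding and the helper names are assumptions made for the illustration.

```python
from itertools import product

def gf4_mul(a, b):
    """Multiply in GF(4) = F_2[w]/(w^2 + w + 1); elements are 2-bit integers."""
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:                # reduce w^2 to w + 1
        r ^= 0b111
    return r

def gf4_trace(x):
    """Relative trace GF(4) -> GF(2): Tr(x) = x + x^2."""
    return x ^ gf4_mul(x, x)

def psi(g, h):
    """The composed cocycle T o mu on (GF(4), +)."""
    return gf4_trace(gf4_mul(g, h))

# Cocycle equation (1), written additively (addition in GF(4) and GF(2) is XOR).
cocycle_ok = all(
    psi(g, h) ^ psi(g ^ h, k) == psi(g, h ^ k) ^ psi(h, k)
    for g, h, k in product(range(4), repeat=3)
)
# Orthogonality: each row g != 0 takes the values 0 and 1 exactly v/w = 2 times.
rows = {g: [psi(g, h) for h in range(4)] for g in range(4)}
orthogonal_ok = all(rows[g].count(0) == 2 and rows[g].count(1) == 2 for g in range(1, 4))
print(cocycle_ok, orthogonal_ok, rows)
```

Here $v = 4$ and $w = 2$, so the resulting cocyclic matrix corresponds to a Hadamard matrix of order 4 once the values 0 and 1 are read as $+1$ and $-1$.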



A more general version of this example is this consequence of Lemma 1.

Theorem 2. If $G$ is a distributive group which is a presemifield, $\lambda, \rho : G \to G$
are automorphisms and $\gamma : G \to C$ is an epimorphism then $\gamma \circ \mu_{\lambda,\rho} : G \times G \to C$
is an orthogonal cocycle.

The composition of an automorphism of $G$ acting diagonally on $G \times G$ with
an (orthogonal) cocycle $G \times G \to C$ is an (orthogonal) cocycle. For example, if $\rho$
is an automorphism then $\mu_{\lambda,\rho} \circ (\rho^{-1} \times \rho^{-1}) = \mu_{\lambda \circ \rho^{-1}, 1}$, so that $\mu_{\lambda,\rho}$ and $\mu_{\lambda \circ \rho^{-1}, 1}$
lie in the same orbit under diagonal Aut($G$)-action. When further composed with
an automorphism of $C$, we obtain an (Aut($C$) $\times$ Aut($G$))-action on the abelian
group of cocycles which partitions it into (Aut($C$) $\times$ Aut($G$))-orbits which either
consist entirely of orthogonal cocycles or contain no orthogonal cocycles.
Denote by $\mathcal{O}(\psi)$ the orbit of $\psi$ under the action

$$\psi^{(\gamma,\theta)}(g,h) = \gamma(\psi(\theta(g), \theta(h))), \quad \gamma \in \mathrm{Aut}(C),\ \theta \in \mathrm{Aut}(G). \qquad (6)$$

In [7] it was shown that these orbits may be collected into bundles under a 
further G-action termed shift equivalence. 

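Definition 4. Cocycles $\psi, \varphi : G \times G \to C$ are shift-equivalent, written $\psi \sim_s \varphi$
via $k$, if there exists $k \in G$ such that $\varphi = \psi \cdot \partial\psi_k$, where $\psi_k(g) = \psi(k,g)$, $g \in G$,
or, equivalently, if there exists $k \in G$ such that $\varphi(g,h) = \psi(kg,h)\,\psi(k,h)^{-1}$.
The bundle $\mathcal{B}(\psi)$ of $\psi$ is $\mathcal{B}(\psi) = \bigcup_{k \in G} \mathcal{O}(\psi \cdot \partial\psi_k)$.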

Shift equivalence preserves orthogonality, so these bundles consist wholly 
of orthogonal cocycles or of non-orthogonal cocycles. In fact, each orthogonal 
bundle corresponds uniquely to an equivalence class of relative difference sets, 
and vice versa (the Bundle Isomorphism Theorem [7]). 

Finally we show that the distribution of a cocycle is an invariant of the bundle
containing that cocycle. It is easy to show it is an invariant of the (Aut($C$) $\times$
Aut($G$)) orbit of the cocycle. Write $\sum_{h \in G} \psi(g,h) = \sum_{a \in C} N(g,a)\,a$. If $k \in G$
then (by (1))

$$\sum_{h \in G} \psi(kg,h)\,\psi(k,h)^{-1} = \sum_{h \in G} \psi(kgk^{-1} \cdot k, h)\,\psi(k,h)^{-1} = \qquad (7)$$
$$\sum_{h \in G} \psi(kgk^{-1}, k)^{-1}\,\psi(kgk^{-1}, kh) = \sum_{a \in C} N(kgk^{-1}, a)\,(ad), \quad d = \psi(kgk^{-1}, k)^{-1}.$$

Hence the distribution is invariant under shift equivalence.

Theorem 3. The distribution $\mathcal{D}(\mathcal{B}(\psi))$ of a cocycle $\psi$ is an invariant of its
bundle $\mathcal{B}(\psi)$.

4 Sequences from Cocycles 

For binary sequences $\phi : G \to \{\pm 1\}$ indexed by an abelian group $G$, the
(periodic) autocorrelation function $A : G \to \mathbb{Z}$ is defined [15] to be $A(g) =
\sum_{h \in G} \phi(g + h)\phi(h)$, and the sequence is a perfect binary array if $A(g) = 0$
for every $g \ne 0$. It is well-known that a perfect binary array corresponds to a
Menon-Hadamard difference set in $G$ and to a splitting relative $(4u^2, 2, 4u^2, 2u^2)$-
difference set in $\mathbb{Z}_2 \times G$, and vice versa.

Perfect binary arrays are investigated as one response to the lack of examples 
of perfect binary sequences (for which $G = \mathbb{Z}_v$ in the definition above). A recent
survey of progress in the search for generalisations of perfect binary sequences, 
and their links to difference sets and divisible difference sets, appears in [15]. 

More generally, the autocorrelation function for a sequence $\phi : G \to C$ where
$G$ and $C$ are multiplicatively written finite groups with $C$ abelian and $\phi(1) = 1$,
is $A : G \to \mathbb{Z}C$ where $A(g) = \sum_{h \in G} \phi(gh)\phi(h)^{-1}$, and $\phi$ is a perfect array if
$A(g) = 0$ for every $g \ne 1$.

We see that for $g \ne 1$,

$$\sum_{h \in G} \phi(gh)\phi(h)^{-1} = 0 \iff \sum_{h \in G} \phi(gh)\phi(h)^{-1}\phi(g)^{-1} = 0 \qquad (8)$$
$$\iff \sum_{h \in G} \partial\phi(g,h) = 0, \qquad (9)$$

so that a perfect binary array is also equivalent to an orthogonal coboundary
with $C = \{\pm 1\}$. A perfect quaternary array has $C = \{\pm 1, \pm i\}$ and in this case the
orthogonal coboundaries determine the subclass of balanced perfect quaternary
arrays: those for which each value of $C$ is taken equally often.

Clearly the coboundary in (9) may be replaced by a cocycle.

Definition 5. A cocyclic perfect array is a cocycle $\psi : G \times G \to C$, where $G$ is
a finite group, $C$ is a finite abelian group, such that in $\mathbb{Z}C$,

$$\forall g \ne 1, \quad \sum_{h \in G} \psi(g,h) = 0. \qquad (10)$$

It is balanced if each element of $C$ appears equally often in the summation.

Whenever $\sum_{a \in C} a = 0$ in $\mathbb{Z}C$, such as in the cases of interest mentioned
above, balanced cocyclic perfect arrays are identical to orthogonal cocycles, and
vice versa.

Hughes [12] has investigated the relationships between autocorrelation
functions and orthogonal cocycles in detail. He has, in a sense, reversed the
process above, deriving from each orthogonal cocycle $\psi$ a 1-dimensional sequence
$E_\psi \to C$ with uniformly distributed autocorrelation function for values $h$ of $E_\psi$
away from a specified forbidden subgroup. This sequence is the characteristic
function of a central relative $(v, w, v, v/w)$-difference set.

The autocorrelation is one measure of the behaviour of a sequence $\phi$ relative
to its translates. Another approach is to look at the spread, as $h$ ranges over
$G$, of the values $\phi(g + h)\phi(h)^{-1}$, and to take the flatness of the distribution as
the measure of merit. Sequences for which this distribution is as flat as possible
are good potential S-box functions because of their resistance to differential
cryptanalysis. What is sought is sequences $\phi : \mathbb{F}_{p^m} \to \mathbb{F}_{p^m}$ such that

$$\Delta = \max_{a \ne 0,\, b} |\{x \in \mathbb{F}_{p^m} : \phi(a + x) - \phi(x) = b\}| \qquad (11)$$

is as small as possible. A function $\phi$ with $\Delta = 1$ is perfect nonlinear (PN) and
with $\Delta = 2$, almost perfect nonlinear (APN). It is known that for $p = 2$, $\Delta = 2$
is the best possible, while for $p \ge 3$, $\Delta = 1$ is achievable. For $p = 2$, extensive
current research focuses on the identification and classification of APN functions
and their relationship to maximally nonlinear sequences (e.g. [3]). It is also clear
that further links to cyclic difference sets are being uncovered [2,3].

In this section, we propose a cocyclic construction of nonlinear functions with 
desirable distribution properties. 

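Note that in (11), $\Delta = 1$ if and only if

$\forall a \ne 0$, $b \in \mathbb{F}_{p^m}$, $|\{x \in \mathbb{F}_{p^m} : \phi(a + x) - \phi(x) = b\}| = 1$ if and only if
$\forall a \ne 0$, $c \in \mathbb{F}_{p^m}$, $|\{x \in \mathbb{F}_{p^m} : \phi(a + x) - \phi(x) - \phi(a) = c\}| = 1$ if and only if
$\partial\phi : G \times G \to G$, $G = (\mathbb{F}_{p^m}, +)$ is an orthogonal coboundary.

Corollary 2. $\phi : \mathbb{F}_{p^m} \to \mathbb{F}_{p^m}$ is a PN function if and only if $\partial\phi$ is an
orthogonal coboundary if and only if $\{(\phi(g), g) : g \in \mathbb{F}_{p^m}\}$ is a splitting relative
$(p^m, p^m, p^m, 1)$-difference set in $\mathbb{F}_{p^m} \times \mathbb{F}_{p^m}$ relative to $\mathbb{F}_{p^m} \times \{0\}$.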

In [8] the author shows that the constructions of abelian splitting relative $(p^m, p^m, p^m, 1)$-difference sets known to her [14,18] for odd $p$ are isomorphic to those defined by the multiplication cocycle in some commutative semifield. In these cases, the cocycle is (necessarily) a coboundary [8, Theorem 3.8]. By Ganley's result [5], no abelian splitting relative $(2^m, 2^m, 2^m, 1)$-difference set exists, so $\Delta = 2$ is indeed the best possible for $p = 2$, in contrast to the situation for odd primes.

However, in the same article, the author shows that the field multiplication cocycle defines the known [6,18] abelian (non-splitting) relative $(2^m, 2^m, 2^m, 1)$-difference sets, and thus, if we relax the requirement that a nonlinear function be 1-dimensional, a wealth of constructions of sequences with excellent distribution properties exists.

Definition 6. Let $\psi : G \times G \to C$ be a cocycle and define $\Delta = \max_{g \ne 1,\, c} N(g, c)$. We say $\psi$ is a $\Delta$-nonlinear cocycle; in particular, $\psi$ is perfect nonlinear (PN) if $\Delta = 1$ and almost perfect nonlinear (APN) if $\Delta = 2$.

Combining this definition with the results of the previous section we obtain:

Corollary 3. If $\psi : G \times G \to C$ is an orthogonal cocycle it is $v/w$-nonlinear, and if $\iota : C \to C'$ is any monomorphism, $\iota \circ \psi$ is $v/w$-nonlinear.

Theorem 4. If $\psi$ is $\Delta$-nonlinear, so is every cocycle in its bundle $B(\psi)$.

We illustrate the above ideas in full detail for the small case $G = C = \mathbb{Z}_2^2$.

Example 8. If $G = C = \mathbb{Z}_2^2$, each cocycle $\psi$ is identified by a quadruple $(x, y, z, t)$, where $x, y, z, t \in C$. With indexing $\{0 = 00, 10, 01, 11\}$, the cocyclic matrix $M_\psi$ is

$$M_\psi = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & x & z & x + z \\ 0 & z + t & y & y + z + t \\ 0 & x + z + t & y + z & x + y + t \end{pmatrix}.$$

There are 16 bundles of cocycles [7] and on computing their distributions from a representative cocycle $\psi$ we obtain the following table, in which bundles are grouped according to the cohomology class of $\psi$. The first group (of 2 bundles) consists of the coboundaries. The first three groups consist of symmetric cocycles (and $E_\psi$ is abelian) and the second three groups consist of non-symmetric cocycles (and $E_\psi$ is nonabelian). The distributions are read off in order from the rows of the corresponding $M_\psi$ and abbreviated, so that for example, the third row of $M_{(a,0,c,0)}$ is $0, c, 0, c$, which is represented as $2^2$ and the fourth is $0, a + c, c, a$, which is represented as $1^4$. (Here distinct variables refer to distinct nonzero elements of $C$.)
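The row distributions described above can be recomputed mechanically. The sketch below is illustrative only: it encodes the elements of $C = \mathbb{Z}_2^2$ as the integers 0 to 3 with XOR as addition, and it assumes that $N(g, c)$ counts the entries equal to $c$ in row $g$ of the cocyclic matrix, which is what the abbreviated distributions suggest. The function names are hypothetical.

```python
from collections import Counter

def cocyclic_matrix(x, y, z, t):
    # Elements of C = Z_2^2 are encoded as integers 0..3; addition is XOR.
    return [
        [0, 0, 0, 0],
        [0, x, z, x ^ z],
        [0, z ^ t, y, y ^ z ^ t],
        [0, x ^ z ^ t, y ^ z, x ^ y ^ t],
    ]

def row_distributions(matrix):
    # For each row, the multiplicities of the values occurring in it (largest first).
    return [sorted(Counter(row).values(), reverse=True) for row in matrix]

def delta(matrix):
    # Largest multiplicity over the rows indexed by the nonzero elements g.
    return max(max(Counter(row).values()) for row in matrix[1:])

a, b, c = 1, 2, 3  # three distinct nonzero elements of C
dist_a0c0 = row_distributions(cocyclic_matrix(a, 0, 2, 0))  # quadruple (a, 0, c, 0) with c = 2
delta_field = delta(cocyclic_matrix(a, b, c, 0))            # quadruple (a, b, c, 0)
```

With these encodings the quadruple $(a, 0, c, 0)$ yields the row pattern $4, 1^4, 2^2, 1^4$ quoted in the text, and $(a, b, c, 0)$ yields rows in which every value occurs once.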





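Bundle | $\psi$       | Distribution             | $E_\psi$                            | $\Delta$ | Comments
-------+--------------+--------------------------+-------------------------------------+----------+----------------------------------
1.1    | (0, 0, 0, 0) | {4, 4, 4, 4}             | $\mathbb{Z}_2^4$                    | 4        | Trivial cocycle
1.2    | (0, 0, c, 0) | {4, 2^2, 2^2, 2^2}       | $\mathbb{Z}_2^4$                    | 2        | APN = best 1-dim. function
2.1    | (a, 0, 0, 0) | {4, 2^2, 4, 2^2}         |                                     | 4        |
2.2    | (a, a, 0, 0) | {4, 2^2, 2^2, 2^2}       | $\mathbb{Z}_2^2 \times \mathbb{Z}_4$ | 2        | APN
2.3    | (a, 0, c, 0) | {4, 1^4, 2^2, 1^4}       | $\mathbb{Z}_2^2 \times \mathbb{Z}_4$ | 2        | APN, $\mu$ for $F_2 + uF_2$
3.1    | (a, b, 0, 0) | {4, 2^2, 2^2, 1^4}       | $\mathbb{Z}_4^2$                    | 2        | APN, $\mu$ for $F_2 \times F_2$
3.2    | (a, b, c, 0) | {4, 1^4, 1^4, 1^4}       | $\mathbb{Z}_4^2$                    | 1        | PN, $\mu$ for $F_4$
4.1    | (0, 0, 0, d) | {4, 4, 2^2, 2^2}         | $\mathbb{Z}_2 \times D_4$           | 4        |
4.2    | (0, 0, c, d) | {4, 2^2, 2^2, 1^4}       | $\mathbb{Z}_2 \times D_4$           | 2        | APN
5.1    | (0, b, 0, d) | {4, 4, 1^4, 1^4}         | $E_5$ [7]                           | 4        |
5.2    | (0, b, c, c) | {4, 2^2, 2^2, 2^2}       | $E_5$                               | 2        | APN
5.3    | (0, b, b, d) | {4, 2^2, 1^4, 2^2}       | $E_5$                               | 2        | APN
6.1    | (a, a, 0, a) | {4, 2^2, 2^2, 2^2}       | $\mathbb{Z}_4 \rtimes \mathbb{Z}_4$ | 2        | APN
6.2    | (a, a, 0, d) | {4, 2^2, 1^4, 1^4}       | $\mathbb{Z}_4 \rtimes \mathbb{Z}_4$ | 2        | APN
6.3    | (a, a, a, d) | {4, 2^2, 1^4, 2^2}       | $\mathbb{Z}_4 \rtimes \mathbb{Z}_4$ | 2        | APN
6.4    | (a, a, c, a) | {4, 1^4, 1^4, 1^4}       | $\mathbb{Z}_4 \rtimes \mathbb{Z}_4$ | 1        | PN, power cocycle for $F_4$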



The 2-dimensional APN functions in this table are all new sequences with distribution as good as or flatter than the best 1-dimensional APN functions (which must equate to coboundaries, i.e. those in bundle 1.2 above). The 2-dimensional balanced APN functions involving a single variable (bundles 1.2, 2.2 and 6.1 above) can also be derived as the monomorphic images of orthogonal (hence APN) cocycles $G \times G \to \mathbb{Z}_2$ using Corollary 3.

The 2-dimensional PN functions in this table (bundles 3.2 and 6.4 above) are new sequences whose distributions achieve the theoretical limit not attainable by any 1-dimensional functions.

Theorem 5. Let $G$ be the additive group of a presemifield and let $\lambda, \rho : G \to G$ be any automorphisms. Then $\mu_{\lambda,\rho}$ has distribution $\{N(0, 0) = v;\ N(0, c) = 0,\ c \ne 0;\ N(g, c) = 1,\ g \ne 0\}$, so that $\Delta = 1$ and $\mu_{\lambda,\rho}$ is a PN cocycle.

There are many presemifield structures with underlying additive group $G = \mathbb{Z}_p^m$ (for example any finite field or Dickson commutative semifield). In this case there are $(p^m - 1)(p^m - p) \cdots (p^m - p^{m-1})$ automorphisms of $G$, each of which can be expressed as a unique linearised permutation polynomial of the finite field structure on $G$. We hereby obtain new sequences whose distributions achieve the theoretical limit.

Abstract. The difference $g_2 - d_2$ for a binary linear $[n, k, d]$ code $C$ is studied. Here $d_2$ is the smallest size of the support of a 2-dimensional subcode of $C$ and $g_2$ is the smallest size of the support of a 2-dimensional subcode of $C$ which contains a codeword of weight $d$. For codes of dimension 4, the maximal value of $g_2 - d_2$ is determined. For general dimensions, upper and lower bounds on the maximal value are given.



1 Introduction 

Ozarow and Wyner [6] suggested one application of linear codes to cryptology, 
namely to the wire-tap channel of type II. For this channel, an adversary is 
assumed to be able to tap s bits (of his choice) of n bits transmitted. The goal 
for the sender is to encode k bits of information into n transmitted bits in such 
a way that the adversary gets as little information as possible. 

One of their schemes was to use the dual of an $[n, k]$ binary linear code $C$. The code has $2^k$ cosets, each representing a binary $k$-tuple. If the sender wants to transmit $k$ bits of information to the receiver, he selects a random vector in the corresponding coset. The channel is assumed to be noiseless, so the receiver can determine the corresponding coset of the received vector. It is assumed the adversary has full knowledge of the code, but not of the random selection of a vector in a coset.

In his studies of this scheme, Wei [7] introduced a set of parameters of a binary code which he called the generalized Hamming weights. The same parameters had also been studied previously in another context [4] and have since proved important also in other contexts.

For any code $D$, let $\chi(D)$, the support of $D$, be the set of positions where not all the codewords of $D$ are zero. Further, the support weight of $D$ is $|\chi(D)|$. For an $[n, k]$ code $C$ and any $r$, where $1 \le r \le k$, Wei defined

$$d_r(C) = \min\{|\chi(D)| \mid D \text{ is an } [n, r] \text{ subcode of } C\}.$$

In particular, the minimum distance of $C$ is $d_1(C)$. The weight hierarchy of $C$ is the set $\{d_r(C) \mid 1 \le r \le k\}$.

For the Ozarow-Wyner scheme, it was shown by Wei [7] that the adversary can obtain $r$ bits of information if and only if $s \ge d_r(C)$.

Cohen et al. [2], [3] considered the following variation of the problem. The adversary is greedy. He first reads $d = d_1$ positions to obtain one symbol of information as soon as possible. He then reads a minimal number of further positions to get one additional symbol of information and so on. Let $g_r$ denote the minimal number of symbols he has to read to get $r$ symbols of information in this way. Note that $g_1 = d_1$ and $g_k = d_k$. We call the sequence $(g_1, g_2, g_3, \ldots, g_k)$ the greedy weight hierarchy. In particular, $g_2$ is the smallest support of a 2-dimensional subcode of $C$ which contains a codeword of weight $d$. The cost to the adversary (in extra positions read) to get two symbols of information using this algorithm is $g_2 - d_2$. We consider here how large $g_2 - d_2$ can be for given $n$, $k$, and $d$. In general, we consider three main cases:

Case I: There exist two codewords $c_1, c_2 \in C$ such that $w(c_1) = d$ and $|\chi(c_1) \cup \chi(c_2)| = d_2$. Here and in the following $w(\cdot)$ denotes the Hamming weight. In this case $g_2 = d_2$ by definition and there is nothing more to be said.

Case II: We do not have case I, but there exist three codewords $c_1, c_2, c_3 \in C$ such that $w(c_1) = d$, $|\chi(c_2) \cup \chi(c_3)| = d_2$, and $|\chi(c_1) \cup \chi(c_3)| = g_2$.

Case III: We do not have case I or II. Then we consider in detail the 4-dimensional subspace $D$ generated by four codewords $c_1, c_2, c_3, c_4 \in C$ such that two of them generate a subspace of support $d_2$ and the other two a subspace of support $g_2$ which contains a codeword of weight $d$.
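For small codes the quantities $d_1$, $d_2$ and $g_2$ can be found by exhaustive search directly from the definitions. The following is a minimal sketch under that brute-force assumption; the helper names and the two sample generator matrices (a $[7,4,3]$ Hamming code and a small $[9,3,3]$ code chosen for illustration) are not from the paper.

```python
from itertools import combinations

def codewords(generator_rows):
    """All nonzero codewords of the binary code spanned by the rows (as int bitmasks)."""
    rows = [int(r, 2) for r in generator_rows]
    words = set()
    for mask in range(1, 2 ** len(rows)):
        w = 0
        for i, r in enumerate(rows):
            if (mask >> i) & 1:
                w ^= r
        words.add(w)
    words.discard(0)
    return sorted(words)

def weight(v):
    return bin(v).count("1")

def d1_d2_g2(generator_rows):
    words = codewords(generator_rows)
    d1 = min(weight(c) for c in words)
    # Two distinct nonzero binary codewords always span a 2-dimensional subcode,
    # whose support is the union (bitwise OR) of their supports.
    d2 = min(weight(a | b) for a, b in combinations(words, 2))
    # Greedy variant: the subcode must contain a codeword of minimum weight.
    g2 = min(weight(a | b) for a in words if weight(a) == d1
             for b in words if b != a)
    return d1, d2, g2

hamming = d1_d2_g2(["1000110", "0100101", "0010011", "0001111"])
small = d1_d2_g2(["000001111", "000110011", "111000000"])
```

For the Hamming code the greedy adversary loses nothing ($g_2 = d_2$, case I), whereas in the second code the unique minimum-weight codeword lies in no 2-dimensional subcode of minimum support, so $g_2 - d_2 = 1$.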



2 Case II Codes 



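We first consider briefly $[n, 3, d]$ codes. Let $m_3 = m_3(n, d)$ be defined by

$$m_3(n, d) = \max\{g_2(D) - d_2(D) \mid D \text{ is an } [n, 3, d] \text{ code without zero-positions}\}.$$

The exact value of $m_3(n, d)$ was determined in [1]:

$$m_3 = \begin{cases} d - \left\lceil \frac{n-d}{3} \right\rceil & \text{if } \ldots \le d \le \ldots, \\ \left\lfloor \frac{2n-3d-3}{2} \right\rfloor - \left\lceil \frac{n-d}{3} \right\rceil & \text{if } \ldots < d \le \ldots, \\ 0 & \text{otherwise.} \end{cases} \qquad (1)$$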



For $[n, 3, d]$ codes in general (possibly with zero-positions) which satisfy condition II, we get $g_2 - d_2 \le m_3(d_3, d)$. We have $m_3(d_3, d) = 0$ for $d > 4d_3/7$ and in particular for $d > 4n/7$. Further, $m_3(d_3, d) \le (4d_3 - 7d)/6 \le (4n - 7d)/6$. Also, $m_3(d_3, d) \le d/2$ for all $d$. Since any $[n, k, d]$ code which satisfies condition II contains an $[n, 3, d]$ code which satisfies condition II, we get the following upper bound:

$$g_2 - d_2 \le \begin{cases} \frac{d}{2} & \text{if } 0 < d \le \frac{2n}{5}, \\ \frac{4n-7d}{6} & \text{if } \frac{2n}{5} < d \le \frac{4n}{7}, \\ 0 & \text{if } d > \frac{4n}{7}. \end{cases} \qquad (2)$$

3 Case III Codes

First we consider codes without zero positions and translate the problem into geometric terms.

Let $G$ be a generator matrix for an $[n, k]$ code $C$. For any $x \in GF(2)^k$, $m(x)$, the value of $x$, will denote the number of occurrences of $x$ as a column in $G$. In [5] it was shown that there is a one-one correspondence between the subcodes $D$ of $C$ of dimension $r$ and the subspaces $U$ of $GF(2)^k$ of dimension $(k - r)$ such that if $D$ corresponds to $U$, then $w_S(D) + \sum_{x \in U} m(x) = n$, where $w_S(D) = |\chi(D)|$ is the support weight of $D$.

We may view the vectors as points in the projective space $PG(k - 1, 2)$. A value assignment is a function $m : PG(k - 1, 2) \to N = \{0, 1, 2, \ldots\}$. For $p \in PG(k - 1, 2)$ we call $m(p)$ the value of $p$. A value assignment defines a generator matrix and a code (up to equivalence). We define the value of a subset $S$ of $PG(k - 1, 2)$ as follows: $m(S) = \sum_{p \in S} m(p)$.
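The correspondence between subcodes and subspaces can be checked numerically on a small example. The sketch below is an illustration under stated assumptions: a subcode $D$ is given by a list `basis` of message vectors $a$ (its codewords are the products $aG$), and $U$ is taken to be the set of vectors orthogonal to every $a$ in the basis. The column multiset is a hypothetical $[9,3]$ example, not one from the paper.

```python
from collections import Counter
from itertools import combinations, product

def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x)) % 2

def support_and_value(columns, basis):
    """Return (support weight of D, value of the corresponding subspace U)."""
    k = len(columns[0])
    m = Counter(columns)  # m(x) = number of occurrences of x as a column of G
    # Position j lies in the support of D iff some basis codeword is nonzero there.
    support = {j for j, col in enumerate(columns) if any(dot(a, col) for a in basis)}
    # U = all vectors orthogonal to every message vector in the basis.
    U = [x for x in product((0, 1), repeat=k) if all(dot(a, x) == 0 for a in basis)]
    return len(support), sum(m[x] for x in U)

columns = [(0, 0, 1)] * 3 + [(0, 1, 0)] * 2 + [(1, 0, 0)] * 2 + [(1, 1, 0)] * 2
nonzero = [a for a in product((0, 1), repeat=3) if any(a)]
pairs_ok = all(sum(support_and_value(columns, [a, b])) == len(columns)
               for a, b in combinations(nonzero, 2))
```

In this example every 1-dimensional and every 2-dimensional subcode satisfies support weight plus value equal to $n = 9$, which is the identity used throughout the geometric arguments that follow.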

Suppose that a value assignment $m$ corresponding to an $[n, 4, d]$ code is given. By definition, we have $n - d = \max\{m(P) \mid P \text{ is a plane}\}$. Let

$$\alpha = \max\{m(l) \mid l \text{ is a line}\},$$

$$\beta = \max\{m(l) \mid l \text{ is a line in a plane of value } n - d\}.$$

Condition III can then be stated as follows: if $l$ is a line of value $\alpha$ and $P$ is a plane of value $n - d$, then all lines $l'$ in $P$ have value at most $\beta$ and the lines in $P$ meeting $l$ have value less than $\beta$.

We get $g_2 - d_2 = \alpha - \beta$. Hence, we want to maximize $\alpha - \beta$ for given $n$ and $d$. We denote this maximum by $m_4 = m_4(n, d)$.

The space $PG(3, 2)$ contains 15 points, 35 lines, and 15 planes. To be able to refer to these items in a simple way we find it convenient to introduce coordinates, that is, each point is given by a nonzero quadruple. To simplify the notation, we refer to the point $(c_1 c_2 c_3 c_4)$ as point $8c_1 + 4c_2 + 2c_3 + c_4$ and we write $m(c_1 c_2 c_3 c_4) = x_{8c_1 + 4c_2 + 2c_3 + c_4}$. For example, $(0110)$ is named 6 and $m(0110) = x_6$. The space $PG(3, 2)$ is illustrated in the figure above where all the points and some of the lines are included.

Without loss of generality (renaming the points if necessary) we may assume that

$$m(P^*) = n - d, \text{ where } P^* = \{1, 2, 3, 4, 5, 6, 7\},$$
$$m(l_0) = \beta, \text{ where } l_0 = \{1, 2, 3\},$$
$$m(l^*) = \alpha, \text{ where } l^* = \{4, 8, 12\}.$$



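The Range $d < 2n/7$

Theorem 1. If $d < (n + 6)/6$ then $m_4(n, d) = 0$.

Theorem 2. If $(n + 6)/6 \le d$ and

$$\left\lfloor \frac{n - d}{5} \right\rfloor \le \left\lceil \frac{3n - 8d - 7}{5} \right\rceil \qquad (3)$$

then

$$m_4(n, d) = \left\lfloor \frac{6d - n - 6}{5} \right\rfloor.$$

Note that condition (3) is true for $d < 2n/7$.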

Proof of Theorem 2: The three lines in $P^*$ through point 4 all have value at most $\beta - 1$ by condition III. Hence

$$3(\beta - 1) \ge m(\{1, 4, 5\}) + m(\{2, 4, 6\}) + m(\{3, 4, 7\}) = n - d + 2x_4, \qquad (4)$$

and so

$$2x_4 \le 3(\beta - 1) - (n - d).$$

Since $\{1, 2, 3, 4\} \subset P^*$ we get $\beta + x_4 \le n - d$. Hence, by (4) we get

$$2x_4 \le 3(n - d - x_4 - 1) - (n - d),$$

which implies $5x_4 \le 2n - 2d - 3$ and so

$$x_4 \le u \stackrel{\text{def}}{=} \left\lfloor \frac{2n - 2d - 3}{5} \right\rfloor. \qquad (5)$$

From (4) we also get

$$\beta \ge \left\lceil \frac{n - d + 2x_4 + 3}{3} \right\rceil. \qquad (6)$$

We have $\alpha = x_4 + x_8 + x_{12} \le x_4 + d$ and so

$$\alpha - \beta \le x_4 + d - \left\lceil \frac{n - d + 2x_4 + 3}{3} \right\rceil = d - \left\lceil \frac{n - d - x_4 + 3}{3} \right\rceil \le d - \left\lceil \frac{n - d - u + 3}{3} \right\rceil$$
$$= d - \left\lceil \frac{n - d - \lfloor (2n - 2d - 3)/5 \rfloor + 3}{3} \right\rceil = d - \left\lceil \frac{n - d + 6}{5} \right\rceil = \left\lfloor \frac{6d - n - 6}{5} \right\rfloor.$$

If $6d - n - 6 < 0$, then $\alpha < \beta$. This is impossible and so $m_4 = 0$ in this case. This proves Theorem 1. If $6d - n - 6 \ge 0$, then a value assignment which reaches the upper bound is given as follows:



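$x_p = 0$ for $p \notin \{1, 2, 3, 4, 7, 8, 12\}$, $x_8 = \lceil d/2 \rceil$, $x_{12} = \lfloor d/2 \rfloor$,

$n - d$      | $x_1$     | $x_2$     | $x_3$     | $x_7$ | $x_4 = u$  | $\alpha$        | $\beta$
-------------+-----------+-----------+-----------+-------+------------+-----------------+-----------
$5\mu$       | $\mu + 1$ | $\mu$     | $\mu$     | 0     | $2\mu - 1$ | $d + 2\mu - 1$  | $3\mu + 1$
$5\mu + 1$   | $\mu + 1$ | $\mu$     | $\mu$     | 1     | $2\mu - 1$ | $d + 2\mu - 1$  | $3\mu + 1$
$5\mu + 2$   | $\mu + 1$ | $\mu + 1$ | $\mu$     | 0     | $2\mu$     | $d + 2\mu$      | $3\mu + 2$
$5\mu + 3$   | $\mu + 1$ | $\mu + 1$ | $\mu$     | 1     | $2\mu$     | $d + 2\mu$      | $3\mu + 2$
$5\mu + 4$   | $\mu + 1$ | $\mu + 1$ | $\mu + 1$ | 0     | $2\mu + 1$ | $d + 2\mu + 1$  | $3\mu + 3$

Note that $\alpha - \beta = d - \mu - 2 = \left\lfloor \frac{6d - n - 6}{5} \right\rfloor$ in all cases. This construction is valid provided

$$n - d - 1 \ge m(P) = x_1 + x_4 + x_8 + x_{12} = \left\lfloor \frac{n - d + 5}{5} \right\rfloor + \left\lfloor \frac{2n - 2d - 3}{5} \right\rfloor + d,$$

where $P$ is the plane $\{1, 4, 5, 8, 9, 12, 13\}$, and this is equivalent to condition (3).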
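The value assignment for Theorem 2 can be tested by enumerating the lines and planes of $PG(3,2)$ with the integer labelling of points used above. The sketch below is a check under stated assumptions: the point values are those of the construction as far as they can be read from the table ($x_1 = \mu + 1$; $x_2 = \mu$ or $\mu + 1$; and so on, with $\mu = \lfloor (n-d)/5 \rfloor$), and the function names are hypothetical.

```python
from itertools import combinations

POINTS = range(1, 16)  # nonzero 4-bit vectors, labelled 8*c1 + 4*c2 + 2*c3 + c4
# A line is {a, b, a+b}; addition of binary vectors is XOR on the labels.
LINES = {frozenset((a, b, a ^ b)) for a, b in combinations(POINTS, 2)}
# A plane is the set of points orthogonal to a fixed nonzero vector h.
PLANES = {frozenset(p for p in POINTS if bin(p & h).count("1") % 2 == 0)
          for h in POINTS}

def theorem2_assignment(n, d):
    mu, r = divmod(n - d, 5)
    x = dict.fromkeys(POINTS, 0)
    x[1] = mu + 1
    x[2] = mu + (1 if r >= 2 else 0)
    x[3] = mu + (1 if r == 4 else 0)
    x[7] = 1 if r in (1, 3) else 0
    x[4] = (2 * n - 2 * d - 3) // 5      # u
    x[8] = (d + 1) // 2                  # ceil(d/2)
    x[12] = d // 2                       # floor(d/2)
    return x

def top_alpha_beta(x):
    value = lambda s: sum(x[p] for p in s)
    top = max(value(P) for P in PLANES)              # should equal n - d
    alpha = max(value(l) for l in LINES)
    beta = max(value(l) for P in PLANES if value(P) == top
               for l in LINES if l <= P)
    return top, alpha, beta

x = theorem2_assignment(95, 27)
top, alpha, beta = top_alpha_beta(x)
```

For $(n, d) = (95, 27)$ the enumeration gives a largest plane value of $68 = n - d$ and $\alpha - \beta = 12$, in agreement with the value $m_4(95, 27) = 12$ quoted in the Remarks.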



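The Range $2n/7 \le d < 4n/11$

Theorem 3. If $(2n - 4)/7 \le d$ and

$$\left\lfloor \frac{n - 2d}{3} \right\rfloor + \left\lfloor \frac{n - 2d + 1}{3} \right\rfloor \ge \left\lfloor \frac{7d - 2n + 6}{6} \right\rfloor + \left\lfloor \frac{7d - 2n + 8}{6} \right\rfloor \qquad (7)$$

then

$$m_4(n, d) = \left\lfloor \frac{d - 3}{2} \right\rfloor.$$

Note that (7) is satisfied for $d < 4n/11$.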

Proof of Theorem 3. There are three planes which contain $l^*$. These planes must have value at most $n - d - 1$. Hence

$$n + 2\alpha = \sum_{P \supset l^*} m(P) \le 3(n - d - 1)$$






136 Wende Chen and Torleiv Kløve



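and so

$$\alpha \le \left\lfloor \frac{2n - 3d - 3}{2} \right\rfloor. \qquad (8)$$

We have

$$x_4 = \alpha - (x_8 + x_{12}) \ge \alpha - d. \qquad (9)$$

Combining with (6) we get

$$\alpha - \beta \le \alpha - \left\lceil \frac{n - d + 2(\alpha - d) + 3}{3} \right\rceil = \left\lfloor \frac{\alpha - n + 3d - 3}{3} \right\rfloor \le \left\lfloor \frac{\lfloor (2n - 3d - 3)/2 \rfloor - n + 3d - 3}{3} \right\rfloor = \left\lfloor \frac{d - 3}{2} \right\rfloor.$$

This proves $m_4(n, d) \le \lfloor (d - 3)/2 \rfloor$.

We now give a value assignment which attains this bound in the range given in Theorem 3.

$x_p = 0$ for $p \in \{9, 10, 11, 13, 14, 15\}$,

$$x_1 = \left\lfloor \frac{n - 2d + 2}{3} \right\rfloor, \quad x_2 = \left\lfloor \frac{n - 2d + 1}{3} \right\rfloor, \quad x_3 = \left\lfloor \frac{n - 2d}{3} \right\rfloor,$$

$$x_5 = \left\lfloor \frac{7d - 2n + 4}{6} \right\rfloor, \quad x_6 = \left\lfloor \frac{7d - 2n + 6}{6} \right\rfloor, \quad x_7 = \left\lfloor \frac{7d - 2n + 8}{6} \right\rfloor,$$

$$x_4 = \left\lfloor \frac{2n - 5d - 3}{2} \right\rfloor, \quad x_8 = \left\lceil \frac{d}{2} \right\rceil, \quad x_{12} = \left\lfloor \frac{d}{2} \right\rfloor.$$

The value assignment is valid if $x_p \ge 0$ for all $p$, that is, $n - 2d \ge 0$ and $7d - 2n + 4 \ge 0$; and $x_1 + x_2 + x_3 \ge x_1 + x_6 + x_7$, which is the same as (7).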



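The Range $d > 4n/11$

Theorem 4. If $n - d \equiv 1 \pmod 7$ and

$$\left\lfloor \frac{2n - 3d - 3}{2} \right\rfloor \le \left\lceil \frac{3n - 3d + 3}{7} \right\rceil + 1, \qquad (10)$$

or $n - d \not\equiv 1 \pmod 7$ and

$$\left\lfloor \frac{2n - 3d - 3}{2} \right\rfloor \le \left\lceil \frac{3n - 3d + 3}{7} \right\rceil, \qquad (11)$$

then $m_4(n, d) = 0$.

Theorem 5. If $n - d \equiv 1 \pmod 7$ and

$$\left\lfloor \frac{n - d - 2}{7} \right\rfloor + d \ge \left\lfloor \frac{2n - 3d - 3}{2} \right\rfloor > \left\lceil \frac{3n - 3d + 3}{7} \right\rceil + 1 \qquad (12)$$

then

$$m_4(n, d) = \left\lfloor \frac{2n - 3d - 3}{2} \right\rfloor - \left\lceil \frac{3n - 3d + 3}{7} \right\rceil - 1.$$

If $n - d \not\equiv 1 \pmod 7$ and

$$\left\lfloor \frac{n - d - 2}{7} \right\rfloor + d \ge \left\lfloor \frac{2n - 3d - 3}{2} \right\rfloor > \left\lceil \frac{3n - 3d + 3}{7} \right\rceil \qquad (13)$$

then

$$m_4(n, d) = \left\lfloor \frac{2n - 3d - 3}{2} \right\rfloor - \left\lceil \frac{3n - 3d + 3}{7} \right\rceil.$$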

Consider first the lines in $P^*$. Three of the lines (those containing the point 4) have value at most $\beta - 1$, the remaining at most $\beta$. Since each point in $P^*$ is on three of the lines, we get

$$3m(P^*) = 3(n - d) \le 7\beta - 3 \qquad (14)$$

and so

$$\beta \ge \left\lceil \frac{3n - 3d + 3}{7} \right\rceil. \qquad (15)$$

If $n - d = 7\mu + 1$, then $\lceil (3n - 3d + 3)/7 \rceil = 3\mu + 1$. However, $\beta = 3\mu + 1$ is not possible in this case. Suppose the contrary. We then have two possibilities: i) 3 lines of value $3\mu + 1$ and 4 of value $3\mu$, or ii) 4 lines of value $3\mu + 1$, 2 of value $3\mu$ and 1 of value $3\mu - 1$. In case i) we may assume without loss of generality that the 3 lines are $\{1, 2, 3\}$, $\{1, 6, 7\}$ and $\{2, 5, 7\}$. Hence we get the following set of equations:

$$x_1 + x_2 + x_3 = 3\mu + 1, \quad x_1 + x_6 + x_7 = 3\mu + 1, \quad x_2 + x_5 + x_7 = 3\mu + 1,$$
$$x_3 + x_5 + x_6 = 3\mu, \quad x_1 + x_4 + x_5 = 3\mu, \quad x_2 + x_4 + x_6 = 3\mu,$$
$$x_3 + x_4 + x_7 = 3\mu.$$

However, this system of equations has the unique solution:

$$x_1 = x_2 = x_7 = \mu + \tfrac{1}{2}, \quad x_4 = \mu - \tfrac{1}{2}, \quad x_3 = x_5 = x_6 = \mu,$$

which is impossible since the $x_p$ are integers. Similarly, in case ii) we get a unique solution where some of the $x_p$ are non-integers. Hence, we have the following relation:

$$\text{if } n - d \equiv 1 \pmod 7, \text{ then } \beta \ge \left\lceil \frac{3n - 3d + 3}{7} \right\rceil + 1. \qquad (16)$$

Combining (14), (15) and (8), Theorem 4 follows. We also see that $m_4$ is upper bounded by the expressions in Theorem 5.
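The claim that the system in case i) has a unique, non-integral solution can be confirmed with exact rational arithmetic. The following sketch solves the seven line equations of the plane $P^*$ for a sample value of $\mu$; the small Gauss-Jordan routine and the function names are illustrative helpers, not part of the paper.

```python
from fractions import Fraction

def solve(A, b):
    """Gauss-Jordan elimination over the rationals; A must be invertible."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]

# The seven lines of the plane P* = {1,...,7}; the first three carry value 3*mu + 1.
LINES = [(1, 2, 3), (1, 6, 7), (2, 5, 7), (3, 5, 6), (1, 4, 5), (2, 4, 6), (3, 4, 7)]

def case_i_solution(mu):
    A = [[1 if p in line else 0 for p in range(1, 8)] for line in LINES]
    b = [3 * mu + 1] * 3 + [3 * mu] * 4
    return solve(A, b)  # values x_1, ..., x_7

solution = case_i_solution(3)
```

For $\mu = 3$ this returns $x_1 = x_2 = x_7 = 7/2$, $x_4 = 5/2$ and $x_3 = x_5 = x_6 = 3$, matching the half-integral pattern stated in the proof.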

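For the points in the plane $P^*$ we give the following values:

$$x_1 = \left\lfloor \frac{n - d + 7}{7} \right\rfloor, \quad x_2 = \left\lfloor \frac{n - d + 6}{7} \right\rfloor, \quad x_3 = \left\lfloor \frac{n - d + 3}{7} \right\rfloor, \quad x_4 = \left\lfloor \frac{n - d - 2}{7} \right\rfloor,$$

$$x_5 = \left\lfloor \frac{n - d + 2}{7} \right\rfloor, \quad x_6 = \left\lfloor \frac{n - d + 1}{7} \right\rfloor, \quad x_7 = \left\lfloor \frac{n - d + 4}{7} \right\rfloor.$$

Note that $\alpha = \left\lfloor \frac{2n - 3d - 3}{2} \right\rfloor$. Hence (9) gives the first inequality in (12) and (13). Next, we define

$$x_8 = \left\lceil \frac{\lfloor (2n - 3d - 3)/2 \rfloor - x_4}{2} \right\rceil, \qquad x_{12} = \left\lfloor \frac{\lfloor (2n - 3d - 3)/2 \rfloor - x_4}{2} \right\rfloor.$$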



For the remaining points, we give the following values: 



x„ = 



d + X 4 — [ 



2n—3d—3 I 



where $\epsilon_p$ is given by the following table, where $\nu = n - d \pmod 7$ and $\eta = d \pmod 4$:

The proof that this construction has the stated properties is similar to the proofs of the previous constructions. The simplest, but most tedious way, is to consider 28 cases depending on the various values of $(\nu, \eta)$. We skip the details.



Remarks. 1) The theorems above determine $m_4(n, d)$ for all $n$ and $d$. It is easy to check, e.g. by looking at the 18 possible combinations of residues of $n$ modulo 3 and $d$ modulo 6, that the conditions of at least one theorem above are always satisfied. In some cases, two theorems apply. For example, both Theorems 2 and 3 can be used to show that $m_4(95, 27) = 12$, and both Theorems 3 and 5 show that $m_4(95, 34) = 15$.
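The two numerical examples in Remark 1 can be reproduced with integer arithmetic. The sketch below evaluates only the value formulas of Theorems 2, 3 and 5, written here as $\lfloor (6d - n - 6)/5 \rfloor$, $\lfloor (d - 3)/2 \rfloor$ and $\lfloor (2n - 3d - 3)/2 \rfloor - \lceil (3n - 3d + 3)/7 \rceil$ (minus 1 when $n - d \equiv 1 \pmod 7$); this reading of the formulas is an assumption, and the hypotheses (3), (7), (12), (13) of the theorems are not checked by the code.

```python
def thm2_value(n, d):
    return (6 * d - n - 6) // 5

def thm3_value(n, d):
    return (d - 3) // 2

def thm5_value(n, d):
    alpha = (2 * n - 3 * d - 3) // 2
    beta = -(-(3 * n - 3 * d + 3) // 7)   # ceiling division
    if (n - d) % 7 == 1:
        beta += 1                          # the sharper bound (16)
    return alpha - beta

first = (thm2_value(95, 27), thm3_value(95, 27))
second = (thm3_value(95, 34), thm5_value(95, 34))
```

Both pairs agree, giving 12 for $(95, 27)$ and 15 for $(95, 34)$ as stated in the remark.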

2) General $[n, 4, d]$ codes without zero-positions (that is, codes for which condition III may not be satisfied) can be treated in the same way. We only omit the condition "if $l$ is a line of value $\alpha$ and $P$ is a plane of value $n - d$, then all lines in $P$ meeting $l$ have value less than $\beta$". We do not give further details.

4 General Dimensions

For $[n, 4, d]$ codes in general (possibly with zero-positions) which satisfy condition III, we get $g_2 - d_2 \le m_4(d_4, d)$. We have $m_4(d_4, d) = 0$ for $d > 8d_4/15$ and in particular for $d > 8n/15$. Further, $m_4(d_4, d) \le (8d_4 - 15d)/14 \le (8n - 15d)/14$. Also, $m_4(d_4, d) \le d/2$ for all $d$. In particular, (2) is satisfied also for case III codes. Hence, we have the following theorem.

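Theorem 6. For any $[n, k, d]$ code we have

$$g_2 - d_2 \le \begin{cases} \frac{d}{2} & \text{if } 0 < d \le \frac{2n}{5}, \\ \frac{4n - 7d}{6} & \text{if } \frac{2n}{5} < d \le \frac{4n}{7}, \\ 0 & \text{if } d > \frac{4n}{7}. \end{cases}$$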

We next give a construction of a code without zero-positions for general $k$. It is again given in the projective geometry representation, that is, the points are represented by binary $k$-tuples.

For given $n$, $k$, and $d$, let $X$ be a $(k - 1)$-dimensional projective (binary) space; it contains $2^k - 1$ points. Let

$$H = \{(x_1, x_2, \ldots, x_k) \mid x_k = 0\},$$

which is a fixed $(k - 2)$-dimensional subspace of $X$, and

$$P = \{(x_1, x_2, \ldots, x_k) \mid x_{k-1} = x_k = 0\},$$

which is a fixed $(k - 3)$-dimensional subspace of $H$. Further, let

$$Q = \{(x_1, x_2, \ldots, x_k) \mid x_{k-2} = x_{k-1} = 0\},$$

which is a fixed $(k - 3)$-dimensional subspace of $X$ which is not contained in $H$. We want an assignment $m$ such that

1. $m(H) = n - d$,

2. $m(H') \le n - d$ for all subspaces $H' \subset X$ of dimension $k - 2$,

3. $m(P') \le m(P)$ for all subspaces $P'$ of dimension $k - 3$ of a subspace $H'$ of dimension $k - 2$ and value $n - d$,

4. $m(P') \le m(Q)$ for all subspaces $P' \subset X$ of dimension $k - 3$.

Then $d_2 = n - m(Q)$ and $g_2 = n - m(P)$.

We assign the following values:

- All points $(x_1, x_2, \ldots, x_{k-1}, x_k)$ of even weight are assigned the value 0.

- The $2^{k-2} - 1$ points $(x_1, x_2, \ldots, x_{k-1}, x_k) \ne (1, 0, \ldots, 0, 0)$ of odd weight in $H$ are assigned the value $\lceil (d + 1)/2^{k-3} \rceil$.

- To $(1, 0, \ldots, 0, 0)$ we assign the value which makes $m(H) = n - d$, namely $n - d - (2^{k-2} - 1)\lceil (d + 1)/2^{k-3} \rceil$.

- Finally, $(0, 0, \ldots, 0, 1)$ is assigned the value $d$ and the remaining points of odd weight are assigned the value 0.

For the construction to work we also assume that $m(1, 0, \ldots, 0, 0) \ge \lceil (d + 1)/2^{k-3} \rceil$, that is $n \ge d + 2^{k-2}\lceil (d + 1)/2^{k-3} \rceil$. In particular, this is satisfied if

$$n \ge d + 2^{k-2}(d + 2^{k-3})/2^{k-3} = 3d + 2^{k-2}.$$

We get

$$m(P) = m(1, 0, \ldots, 0) + (2^{k-3} - 1)\lceil (d + 1)/2^{k-3} \rceil = n - d - 2^{k-3}\lceil (d + 1)/2^{k-3} \rceil,$$
$$m(Q) = d + m(1, 0, \ldots, 0) + (2^{k-4} - 1)\lceil (d + 1)/2^{k-3} \rceil = n - 3 \cdot 2^{k-4}\lceil (d + 1)/2^{k-3} \rceil.$$

Hence

$$m(Q) - m(P) = d - 2^{k-4}\lceil (d + 1)/2^{k-3} \rceil \ge d/2 - 2^{k-2}.$$

Consider a subspace $P'$ of $H$ of dimension $k - 3$. We have

$$m(P) - m(P') = m(P \setminus P') - m(P' \setminus P).$$

Both $P \setminus P'$ and $P' \setminus P$ contain $2^{k-4}$ points of odd weight. All the points in $P' \setminus P$ have value $\lceil (d + 1)/2^{k-3} \rceil$ and all the points in $P \setminus P'$ have the same value except possibly $(1, 0, \ldots, 0, 0)$ which has at least this value. Hence $m(P) \ge m(P')$.

If $H' \ne H$ is a space of dimension $k - 2$, then

$$m(H') \le d + m(H' \cap H) \le d + m(P) = n - 2^{k-3}\lceil (d + 1)/2^{k-3} \rceil < n - d.$$

Hence $g_2 - d_2 = m(Q) - m(P)$. We summarize this result in the following theorem.

Theorem 7. If $n \ge d + 2^{k-2}\lceil (d + 1)/2^{k-3} \rceil$, then there exists an $[n, k, d]$ code $C$ without zero-positions such that

$$g_2 - d_2 = d - 2^{k-4}\lceil (d + 1)/2^{k-3} \rceil.$$
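The construction behind Theorem 7 can be checked by brute force for small parameters. The sketch below builds the columns of a generator matrix from the value assignment described in this section (assuming $k \ge 4$ so that $2^{k-4}$ is an integer) and then computes $d_1$, $d_2$ and $g_2$ directly from the codewords. The function names are hypothetical, and the exhaustive search is only practical for small $k$ and $n$.

```python
from itertools import combinations, product

def construction_columns(n, k, d):
    """Columns of a generator matrix for the value assignment (assumes k >= 4)."""
    c = -(-(d + 1) // 2 ** (k - 3))            # ceil((d+1) / 2^(k-3))
    e1 = (1,) + (0,) * (k - 1)
    ek = (0,) * (k - 1) + (1,)
    first = n - d - (2 ** (k - 2) - 1) * c     # value given to (1,0,...,0,0)
    if first < c:
        raise ValueError("n is too small for this construction")
    columns = []
    for p in product((0, 1), repeat=k):
        if sum(p) % 2 == 0:
            value = 0                          # even weight (includes the zero vector)
        elif p == e1:
            value = first
        elif p == ek:
            value = d
        elif p[-1] == 0:
            value = c                          # remaining odd-weight points of H
        else:
            value = 0                          # remaining odd-weight points outside H
        columns += [p] * value
    return columns

def d1_d2_g2(columns):
    k = len(columns[0])
    words = []
    for a in product((0, 1), repeat=k):
        if any(a):
            bits = 0
            for col in columns:
                bits = (bits << 1) | (sum(x * y for x, y in zip(a, col)) % 2)
            words.append(bits)
    weight = lambda v: bin(v).count("1")
    d1 = min(weight(w) for w in words)
    d2 = min(weight(u | v) for u, v in combinations(words, 2))
    g2 = min(weight(u | v) for u in words if weight(u) == d1
             for v in words if v != u)
    return d1, d2, g2

small = d1_d2_g2(construction_columns(11, 4, 3))
larger = d1_d2_g2(construction_columns(22, 4, 6))
```

For $k = 4$ the theorem predicts $g_2 - d_2 = d - \lceil (d+1)/2 \rceil$, that is 1 for $d = 3$ and 2 for $d = 6$, which is what the search returns.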

Asymptotics 

Cohen et al. [3] studied the asymptotics of $(g_2 - d_2)/n$. They used the notations $\delta_1 = d/n$, $\delta_2 = d_2/n$, $\gamma_2 = g_2/n$, and showed that for almost all codes we have

$$H(\delta_2) + \delta_2 \log_2 3 = 2H(\delta_1) \quad \text{and} \quad H(\gamma_2) + \gamma_2 H(\delta_1/\gamma_2) + \delta_1 = 2H(\delta_1),$$

where $H$ is the binary entropy function. Solving these equations for $\delta_2$ and $\gamma_2$ in terms of $\delta_1$, we see that the corresponding $\gamma_2 - \delta_2$ increases slowly with $\delta_1$, reaching a maximum $\approx 0.002016618$ for $\delta_1 \approx 0.262549$ and then it decreases to zero for $\delta_1 = 1/2$.
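The two entropy equations can be solved numerically to reproduce the quoted maximum. The sketch below uses plain bisection and relies on two observations that are assumptions of this illustration rather than statements from the paper: the left-hand side of the first equation is increasing in $\delta_2$ on $[0, 3/4]$, and the left-hand side of the second is increasing in $\gamma_2$ on $[\delta_1, (1 + \delta_1)/2]$.

```python
from math import log2

def H(x):
    """Binary entropy function, with H(0) = H(1) = 0."""
    return 0.0 if x <= 0 or x >= 1 else -x * log2(x) - (1 - x) * log2(1 - x)

def bisect(f, lo, hi, target, steps=200):
    # f is assumed increasing on [lo, hi] with f(lo) <= target <= f(hi).
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def gap(delta1):
    """Return gamma_2 - delta_2 for a given relative minimum distance delta_1."""
    target = 2 * H(delta1)
    delta2 = bisect(lambda t: H(t) + t * log2(3), 0.0, 0.75, target)
    gamma2 = bisect(lambda t: H(t) + t * H(delta1 / t) + delta1,
                    delta1, (1 + delta1) / 2, target)
    return gamma2 - delta2

peak = gap(0.262549)
```

Evaluating `gap` on a grid of values of $\delta_1$ shows the slow rise and fall described in the text, with the peak near $\delta_1 \approx 0.2625$.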

In contrast, we note that Theorems 6 and 7 show that (for fixed $k$), for the codes with maximum $g_2 - d_2$ we have asymptotically

$$\gamma_2 - \delta_2 = \delta_1/2 \quad \text{for all } \delta_1 < 1/3$$

(for $\delta_1 > 1/3$, we only get an upper bound on the maximum).

Abstract. A code is called $t$-identifying if the sets $B_t(x) \cap C$ are all nonempty and different. Constructions of 1-identifying codes and lower bounds on the minimum cardinality of a 1-identifying code of length $n$ are given.



1 Introduction and Basics 

Denote by $F_2$ the binary alphabet $\{0, 1\}$. The (Hamming) distance $d(x, y)$ between any two vectors $x, y \in F_2^n$ is the number of coordinates in which they differ from each other. The integer $d(x, 0)$, where $0$ denotes the all-zero vector $(0, 0, \ldots, 0)$, is called the weight of $x$. If $x, y \in F_2^n$ and $d(x, y) \le t$, then we say that $x$ $t$-covers $y$ (and vice versa). Any nonempty subset $C$ of $F_2^n$ is called a binary code of length $n$. The covering radius of a code is the smallest integer $R$ with the property that every vector in $F_2^n$ is within distance $R$ from at least one codeword.

As usual, we denote for $x \in F_2^n$,

$$B_t(x) = \{y \in F_2^n \mid d(y, x) \le t\}$$

and

$$S_t(x) = \{y \in F_2^n \mid d(y, x) = t\}.$$

The purpose of this paper is to study codes that can be used for identification in the following situation: Assume that $2^n$ processors are arranged in the nodes of an $n$-dimensional hypercube. A processor can check itself and all the neighbours in the hypercube within Hamming distance $t$, and reports YES if everything is all right and NO if there is something wrong in the processor itself or one of these neighbours. We wish to find a code $C$ consisting of some of the nodes in the hypercube such that if the checking is done in the corresponding processors, then based on the answers we receive, we can tell where the problem lies, assuming there is a problem in at most one of the processors. Such a code is called $t$-identifying. This problem has been studied in Karpovsky, Chakrabarty and Levitin [4].

More formally, a $t$-identifying code is defined as follows.

Definition 1. A binary code $C$ of length $n$ is called $t$-identifying (where $t < n$) if the sets $B_t(x) \cap C$, $x \in F_2^n$, are nonempty and different.

The fact that the sets $B_t(x) \cap C$ are all nonempty implies that $C$ has covering radius at most $t$. Moreover, for all $x, y \in F_2^n$, the condition $B_t(x) \cap C = B_t(y) \cap C$ implies that $x = y$.

The results of this paper are from [1], where more detailed proofs of our results can be found.

The minimum cardinality of a $t$-identifying code of length $n$ is denoted by $M_t(n)$.



Theorem 1. Assume that $C$ is 1-identifying. The direct sum $\{0, 1\} \oplus C$ is 1-identifying if and only if $d(c, C \setminus \{c\}) \le 1$ for all $c \in C$.

Proof. Denote the code $\{0, 1\} \oplus C$ by $D$.

If there is a codeword $c \in C$ such that $d(c, C \setminus \{c\}) > 1$, then clearly for both $z = (0, c)$ and $z = (1, c)$ we have $B_1(z) \cap D = \{(0, c), (1, c)\}$ and therefore $D$ is not 1-identifying.

Assume therefore that the condition $d(c, C \setminus \{c\}) \le 1$ always holds, and let $z = (a, b) \in F_2^{n+1}$, where $a \in F_2$ and $b \in F_2^n$, be arbitrary. If $b \notin C$, then there are no codewords not beginning with $a$ in $B_1(z) \cap D$; if $b \in C$, then there is a unique codeword $(a + 1, b)$ not beginning with $a$ in $B_1(z) \cap D$, and by the condition at least two codewords in $B_1(z) \cap D$ that begin with $a$. In both cases we can immediately identify $z$. □

We do not know if $M_1(n) \le 2M_1(n - 1)$ holds (or more generally, $M_{t_1 + t_2}(n_1 + n_2) \le M_{t_1}(n_1) M_{t_2}(n_2)$), but we can prove the following slightly weaker result.

Theorem 2. If $C$ is 1-identifying then so is $C \oplus \{00, 01, 10, 11\}$.

Proof. Assume that $C \subseteq F_2^n$ and let $z = (x, y) \in F_2^{n+2}$, where $x \in F_2^n$ and $y \in F_2^2$, be arbitrary. Without loss of generality, assume $y = 00$. If $x \in C$, then $B_1(z)$ contains the codewords $(x, 00)$, $(x, 01)$, $(x, 10)$, and $(x, 00)$ is the only vector covered by all of them. If $x \notin C$, then all the codewords in $B_1(z)$ end in $y$ and we can identify $z$ because $C$ is 1-identifying. □

2 Constructions 

Theorem 3. $M_1(4) = 7$.

Proof. The seven codewords 0000, 0001, 0010, 0101, 0110, 1011, 1101 form a 1-identifying code of length four. For the proof of optimality, we refer to [1]. □
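Definition 1 translates directly into an exhaustive test that is practical for short lengths. The following is a minimal sketch; the names `ball`, `is_identifying` and `words` are illustrative helpers, and the check simply compares the sets $B_t(x) \cap C$ for all $x$.

```python
from itertools import product

def ball(x, t):
    """All binary words (as tuples) within Hamming distance t of x."""
    return {y for y in product((0, 1), repeat=len(x))
            if sum(a != b for a, b in zip(x, y)) <= t}

def is_identifying(code, n, t=1):
    code = set(code)
    seen = set()
    for x in product((0, 1), repeat=n):
        ident = frozenset(ball(x, t) & code)
        # The identifying set must be nonempty and must not repeat.
        if not ident or ident in seen:
            return False
        seen.add(ident)
    return True

def words(*strings):
    return {tuple(int(ch) for ch in s) for s in strings}

theorem3_code = words("0000", "0001", "0010", "0101", "0110", "1011", "1101")
theorem3_ok = is_identifying(theorem3_code, 4)
```

The seven words of Theorem 3 pass the test, and, consistent with $M_1(4) = 7$, removing any single codeword makes the test fail.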

In the following theorems we mention the best previous upper bounds from [4] in parentheses.

Theorem 4. $M_1(6) \le 19$ (20).

Proof. Take as codewords i) the four words 010000, 001000, 000010, 000001, ii) all the words of weight two except 110000 and its cyclic shifts and iii) all the words of weight five. It is not difficult to check that this code is indeed 1-identifying. □
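The code described in the proof of Theorem 4 is small enough to be verified exhaustively. The sketch below builds the 19 words as 6-bit integers (position 1 is the leftmost bit) and tests the identifying property; all function names are illustrative.

```python
from itertools import combinations

N = 6

def neighbours(x):
    """The word x together with its N neighbours at Hamming distance 1."""
    return [x] + [x ^ (1 << i) for i in range(N)]

def is_1_identifying(code):
    seen = set()
    for x in range(2 ** N):
        ident = frozenset(y for y in neighbours(x) if y in code)
        if not ident or ident in seen:
            return False
        seen.add(ident)
    return True

def theorem4_code():
    bit = lambda i: 1 << (N - i)          # position i = 1..6, leftmost first
    singles = {bit(2), bit(3), bit(5), bit(6)}
    # 110000 and its cyclic shifts are the pairs of cyclically adjacent positions.
    adjacent = {frozenset((i, i % N + 1)) for i in range(1, N + 1)}
    pairs = {bit(i) | bit(j) for i, j in combinations(range(1, N + 1), 2)
             if frozenset((i, j)) not in adjacent}
    fives = {(2 ** N - 1) ^ bit(i) for i in range(1, N + 1)}
    return singles | pairs | fives

code6 = theorem4_code()
```

The resulting set has 4 + 9 + 6 = 19 words and passes the exhaustive test.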

Theorem 5. $M_1(7) \le 32$ (40), $M_1(8) \le 64$ (80), $M_1(9) \le 128$.

Proof. In this proof all pairs $(x, y)$ denote binary vectors with $x \in F_2^5$, $y \in F_2^2$. Denote

$$C = \{(x, y) \mid w(x) = 1\} \setminus \{(00001, 00), (00010, 01), (00100, 10), (01000, 11)\}.$$

We claim that the sets $B_1(u) \cap C$ where $u = (a, b)$ and $w(a) \le 2$ are nonempty and different. Then it is clear that the codewords of $C$ together with their complements form a 1-identifying code with 32 words. This code has the property of Theorem 1, which gives the other two upper bounds.

Let $u = (a, b)$ be fixed.

If $w(a) = 0$, then $B_1(u) \cap C$ consists of four or five codewords $(x, b) \in C$.

If $w(a) = 1$, then all the codewords in $B_1(u) \cap C$ are of the form $(a, y)$, and as we shall see, there are at least two different choices for $y$. If $a = 10000$, then $y$ has the following choices:

b | 00         | 01         | 10         | 11
--+------------+------------+------------+-----------
y | 00, 01, 10 | 00, 01, 11 | 00, 10, 11 | 01, 10, 11

If $a = 01000$, the choice $y = 11$ is not available (because $(01000, 11)$ was removed), but the sets of choices for $y$ still remain different and all contain at least two elements. The same is true in the three remaining cases.

If $w(a) = 2$, the (one or two) codewords in $B_1(u) \cap C$ are of the form $(x, b)$. If there are two codewords in $B_1(u) \cap C$, we immediately know $u$; if there is only one, we know that the vector ending in $b$ among the removed four vectors also has to cover $u$, and we again know $u$. □
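The 32-word code of Theorem 5 and its extensions by the direct sum of Theorem 1 can also be checked exhaustively. The sketch below is an illustration: words are tuples of bits, the pair $(x, y)$ is stored as the concatenation of $x$ and $y$, and the function names are hypothetical.

```python
from itertools import product

def is_1_identifying(code, n):
    code = set(code)
    seen = set()
    for x in product((0, 1), repeat=n):
        nbrs = [x] + [x[:i] + (1 - x[i],) + x[i + 1:] for i in range(n)]
        ident = frozenset(y for y in nbrs if y in code)
        if not ident or ident in seen:
            return False
        seen.add(ident)
    return True

def theorem5_code():
    t = lambda s: tuple(int(ch) for ch in s)
    # The four removed pairs (00001,00), (00010,01), (00100,10), (01000,11).
    removed = {t("0000100"), t("0001001"), t("0010010"), t("0100011")}
    half = {x + y for x in product((0, 1), repeat=5) if sum(x) == 1
            for y in product((0, 1), repeat=2)} - removed
    complements = {tuple(1 - b for b in w) for w in half}
    return half | complements

def direct_sum_with_bit(code):
    """The direct sum {0,1} + C of Theorem 1."""
    return {(a,) + w for a in (0, 1) for w in code}

code7 = theorem5_code()
code8 = direct_sum_with_bit(code7)
code9 = direct_sum_with_bit(code8)
```

The three codes have 32, 64 and 128 words and lengths 7, 8 and 9 respectively, matching the bounds in the theorem.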

Theorem 6. For all even $m \ge 4$,

$$M_1(2^m - 1) \le (2^m + 2^{m-1} - 2)\, 2^{2^m - 2m}.$$

Proof. (Sketch) Denote by $P_m$ the punctured Preparata code of length $n = 2^m - 1$ for all even $m \ge 4$. We take as codewords all the vectors $x \in F_2^n$ such that $d(x, c) = 1$ for some $c \in P_m$ together with all the vectors in $H_m \setminus P_m$, and obtain a 1-identifying code. For a detailed proof, see [1].

The cardinality of $P_m$ is $2^{2^m - 2m}$ and therefore the cardinality of our code equals $(n + (2^{m-1} - 1))\, 2^{2^m - 2m}$. □

Corollary 1. $M_1(15) \le 5632$ (5760).

3 Lower Bounds

The following theorem has been proved in [4]. We prove it using a short counting argument, which also leads to further results. In the proof we use the concept of excess, cf. e.g. [3, Section 6.3]: Assume that $C \subseteq F_2^n$ has covering radius 1. If a vector $x \in F_2^n$ is 1-covered by exactly $i + 1$ codewords of $C$, we say that the excess $E(x)$ (by $C$) on $x$ is $i$. In general, the excess $E(V)$ on a subset $V \subseteq F_2^n$ is defined by

$$E(V) = \sum_{x \in V} E(x).$$


Theorem 7.

$$M_1(n) \ge \frac{n 2^n}{V(n, 2)},$$

where $V(n, 2) = 1 + n + \binom{n}{2}$ denotes the size of a Hamming ball of radius 2.

Proof. Assume that $C$ is a 1-identifying code with $K = M_1(n)$ codewords. There may be at most $K$ points in the space $F_2^n$ covered by a unique codeword; for all the other points $x$ we have $E(x) \ge 1$. Every point $x$ with $E(x) = 1$ is called a son; every point $x$ with $E(x) > 1$ is called a father. Every son has a unique father defined as follows. A son $x$ is covered by exactly two codewords, which necessarily have distance 1 or 2 from each other. Consequently, they both 1-cover exactly one other point, which is called the father of $x$. Since $C$ is 1-identifying, the father must be covered by at least one more codeword and therefore automatically has excess at least 2.

We now divide the space into families: all the sons that have a common father form a family with their father. The families are disjoint; the families together with the uniquely covered points partition the whole space. There may be fathers with no sons.

If the father of a family is covered by exactly $i$ codewords, then it can have at most $\binom{i}{2}$ sons. Indeed, each son is covered by exactly two codewords that also cover the father. The average excess on the points in a family whose father is covered by exactly $i \ge 3$ codewords is therefore at least

$$f(i) := \left( \binom{i}{2} + i - 1 \right) \Big/ \left( \binom{i}{2} + 1 \right).$$

This is a decreasing function on $i \ge 4$; and $f(3) = f(6)$. Assume that $n \ge 6$. Then $f(i) \ge f(n)$ whenever $3 \le i \le n$.

If there were a codeword $c \in C$ such that $B_1(c) \subseteq C$ then we could remove $c$ from the code without violating the identification property. Since $K = M_1(n)$, this is not the case. In other words, there are no families with $i = n + 1$.

The total excess $E(F_2^n)$ trivially equals $K(n + 1) - 2^n$. We now estimate it in a different way.

The uniquely covered points contribute nothing. The number of such points is at most $K$. Apart from the uniquely covered points, the remaining points form families, each of which has average excess at least $f(n)$. Therefore

$$K(n + 1) - 2^n \ge (2^n - K) f(n).$$

The claimed lower bound on $K$ now follows. For $n = 5$, the argument gives $6K - 2^5 \ge (2^5 - K) f(3)$ and hence $K \ge 10$, so the formula of the theorem also holds when $n = 5$. The formula also holds for $n = 4$ by Theorem 3, and it is easy to see that it is also valid for $n = 2$ and $n = 3$. □
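The arithmetic behind the lower bound can be replayed exactly. The sketch below evaluates the average-excess function, written here as $f(i) = (\binom{i}{2} + i - 1)/(\binom{i}{2} + 1)$, and the bound $n 2^n / V(n, 2)$ with $V(n, 2) = 1 + n + \binom{n}{2}$; it also confirms that solving $K(n+1) - 2^n \ge (2^n - K) f(n)$ for $K$ gives precisely that bound. The function names are illustrative.

```python
from fractions import Fraction
from math import comb

def f(i):
    """Lower bound on the average excess in a family whose father has i covering codewords."""
    return Fraction(comb(i, 2) + i - 1, comb(i, 2) + 1)

def lower_bound(n):
    """Smallest integer K with K >= n * 2^n / V(n, 2)."""
    bound = Fraction(n * 2 ** n, 1 + n + comb(n, 2))
    return -(-bound.numerator // bound.denominator)   # ceiling

def bound_from_inequality(n):
    # K(n+1) - 2^n >= (2^n - K) f(n)  rearranges to  K >= 2^n (1 + f(n)) / (n + 1 + f(n)).
    return Fraction(2 ** n) * (1 + f(n)) / (n + 1 + f(n))

small_cases = [lower_bound(n) for n in (4, 5, 7)]
```

The values for $n = 4, 5, 7$ are 6, 10 and 31, consistent with $M_1(4) = 7$, with the bound $K \ge 10$ derived for $n = 5$, and with the starting point $M_1(7) \ge 31$ of Theorem 9.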

A code C is called e- error- correcting if d(a, b) > 2e + 1 whenever a,h G C, 
a yf b. An e-error-correcting code is called perfect if it has covering radius e. 



Theorem 8. Assume that n > 2. Equality holds in the previous theorem if and 
only if there exists a perfect 2- error- correcting code of length n. 



Proof. (Sketch) Assume that $n \ge 7$. It has been proved in [4] that if there exists
a perfect 2-error-correcting code of length $n$, then equality holds in Theorem 7.
From the previous proof we see that if equality holds, there are no families with
$i = n + 1$ or $i < n$, i.e., all the families have $i = n$ and, moreover, consist of
$\binom{n}{2} + 1$ points.

Assume then that a code C attains the bound with equality. If the distance 
between any two fathers is less than five, it is not difficult to show that at least 
one of them does not have enough sons. 

To prove that $D$ is perfect, we show that the covering radius of $D$ is at most
two. Let $z \in \mathbb{F}_2^n$ be arbitrary. We claim that $d(z, D) \le 2$. If $z$
itself is a father or a son, there is nothing to prove. Assume that $z$ is
1-covered by a unique codeword $c$ of $C$. Because
$B_1(z) \cap C \ne B_1(c) \cap C$, we know that $c$ is covered by another
codeword $c' \in C$. However, since $B_1(c) \cap C$ and $B_1(c') \cap C$ are
different and both contain $c$ and $c'$, at least one of them contains another
codeword $c'' \in C$, i.e., either $c$ or $c'$ is a father. □



It is well known that there is no perfect 2-error-correcting code of length 90,
and therefore

\[ M_1(90) > \frac{90 \cdot 2^{90}}{V(90,2)} = 90 \cdot 2^{78}. \]

The following optimality result gives a case where the argument of the
previous theorem can be sharpened.



Theorem 9. $M_1(7) = 32$.

Proof. (Sketch) By Theorem 5 it suffices to prove that $M_1(7) \ge 32$.

Step 1: By Theorem 7, $M_1(7) \ge 31$. Assume that $C$ is a 1-identifying code
of length seven with $K = 31$ codewords. From the proof of Theorem 7 we know
that if there are fewer than $K$ points $x \in \mathbb{F}_2^7$ with $E(x) = 0$, then

\[ 120 = K(n + 1) - 2^n \ge (2^n - K + 1) f(7) > 120, \]

which is a contradiction. Hence for every $c \in C$ there is a vector
$x \in B_1(c)$ such that $E(x) = 0$.

Step 2: The 97 points with excess at least one are divided into families,
say $\mathcal{F}_1, \mathcal{F}_2, \ldots, \mathcal{F}_k$ with fathers
$y_1, y_2, \ldots, y_k$, respectively. A careful analysis similar to the one used
in the proof of Theorem 7 in this specific case shows that we may assume that
$\mathcal{F}_1, \mathcal{F}_2, \mathcal{F}_3$ are families with
$E(y_1) = E(y_2) = E(y_3) = 6$ and all have at least 21 elements.

Step 3: Moreover, since $n = 7$, two of these three fathers are at most distance
four apart, say $d(y_1, y_2) \le 4$. It is not difficult to show that this leads to a
contradiction with the fact that the families are so large or with the observation
made in Step 1. □

The following table presents the currently known bounds on $M_1(n)$ when
$n \le 9$. For results on $M_2(n)$, we refer to [4] and [2].

  n    M_1(n)
  3    a 4 a
  4    b 7 b
  5    c 10 a
  6    c 18-19 e
  7    d 32 f
  8    c 56-64 f
  9    c 101-128 f

Key
  a  Karpovsky et al. [4]
  b  Theorem 3
  c  Theorem 7 (which is from [4])
  d  Theorem 9
  e  Theorem 4
  f  Theorem 5
Abstract. An algorithm is presented allowing the construction of fast
Fourier transforms for any solvable group on a classical computer. The
special structure of the recursion formula being the core of this algorithm
makes it a good starting point to obtain systematically fast Fourier
transforms for solvable groups on a quantum computer. The inherent
structure of the Hilbert space imposed by the qubit architecture suggests
considering groups of order $2^n$ first (where $n$ is the number of qubits). As
an example, fast quantum Fourier transforms for all 4 classes of
non-abelian 2-groups with cyclic normal subgroup of index 2 are explicitly
constructed in terms of quantum circuits. The (quantum) complexity of
the Fourier transform for these groups of size $2^n$ is $O(n^2)$ in all cases.



1 Introduction 

Quantum algorithms are a recent subject and possibly of central importance in 
physics and computer science. It has been shown that there are problems on 
which a putative quantum computer could outperform every classical computer. 
A striking example is Shor’s factoring algorithm (see [27]). 

Here we address a problem used as a subroutine in almost all known quan- 
tum algorithms: The quantum Fourier transform (QFT) and its generalization 
to arbitrary finite groups. In classical computation there exist elaborate meth- 
ods for the construction of Fourier transforms (e. g., [3], [4], [5], [6], [10], [19]), 
therefore it is highly interesting to adapt and modify these methods to get a 
quantum algorithm for the Fourier transform with a much better performance 
(with respect to the quantum complexity model, see Section 3) than any classical 
algorithm. First attempts in this direction have been proposed by Beals [2] and
Høyer [16]. In this paper we present an algebraic approach using representation
theory which can be seen as a first step towards the realization of a large class
of generalized Fourier transforms on a quantum computer.

2 Representation Theory and Fourier Transforms

Fourier transforms for finite groups are an interesting and well studied topic for 
classical computers. We refer to [3], [6], [19], [24] as representatives for a vast 
number of publications. The reader not familiar with the standard notations 
concerning group representations should refer to these publications or to stan- 
dard references like [9] or [26]. For the convenience of the reader we will briefly 
present the terms and notations from representation theory which we are going 
to use and recall the definition of Fourier transforms. 

A representation of a finite group $G$ of degree $\deg(\phi) = n$ is a homomorphism
$\phi : G \to \mathrm{GL}_n(K)$ from $G$ into the group of invertible
$(n \times n)$-matrices over a field $K$. We denote by $1_G : g \mapsto 1$ the
trivial representation of $G$ (of degree 1).

If $A \in \mathrm{GL}_n(K)$, then $\phi^A : g \mapsto A^{-1} \cdot \phi(g) \cdot A$
is the conjugate of $\phi$ by $A$. $\phi$ and $\psi$ are called equivalent if
$\phi = \psi^A$. If $\phi, \psi$ are representations of $G$, then the representation
$\phi \oplus \psi : g \mapsto \phi(g) \oplus \psi(g) =
\begin{pmatrix} \phi(g) & 0 \\ 0 & \psi(g) \end{pmatrix}$
is called the direct sum of $\phi$ and $\psi$. $\phi$ is called irreducible if it
cannot be conjugated to be a direct sum. In this paper, we will deal only with
ordinary representations, i.e. the characteristic of $K$ does not divide the group
order $|G|$ (Maschke condition). In this case, every representation $\phi$ can be
conjugated, by a suitable matrix $A$, to a direct sum of irreducible
representations $\rho_i$ (Maschke's theorem), i.e.
$\phi^A = \rho_1 \oplus \ldots \oplus \rho_k$, which is called a decomposition of
$\phi$, and $A$ is referred to as a decomposition matrix for $\phi$.

Let $\phi$ be a representation of $H \le G$, and $\overline{\phi}$ a representation
of $G$ which is equal to $\phi$ when restricted to $H$
($\overline{\phi} \downarrow H = \phi$). Then $\overline{\phi}$ is called an
extension of $\phi$ to $G$ and $\phi$ is called extensible (to $G$). Note that an
extension does not exist in general. If $\phi$ is a representation of $H \le G$ and
$t \in G$ then $\phi^t : h \mapsto \phi(t h t^{-1})$ is a representation of
$H^t = t^{-1} H t$, called the inner conjugate of $\phi$ by $t$. If $H \le G$ is a
subgroup with transversal (i.e. a system of representatives of the right cosets of
$H$ in $G$) $T = (t_1, \ldots, t_k)$, then
$(\phi \uparrow_T G)(g) = [\dot{\phi}(t_i g t_j^{-1}) \mid i, j = 1 \ldots k]$, where
$\dot{\phi}(x) = \phi(x)$ for $x \in H$ and the all-zero matrix else, is called the
induction of $\phi$ to $G$ with transversal $T$. A regular representation is an
induction of the form $\phi = (1_E \uparrow_T G)$ where $E$ denotes the trivial
subgroup of $G$.

Let $\phi$ be a regular representation of $G$. A Fourier transform for $G$ is any
decomposition matrix $A$ of $\phi$ with the additional property that equivalent
irreducibles in the corresponding decomposition are even equal. Note that the
definition says nothing about the transversal fixing $\phi$, nor the choice of the
irreducible representations. As an example let
$G = Z_n = \langle x \mid x^n = 1 \rangle$ be the cyclic group of order $n$ with
regular representation $\phi = 1_E \uparrow_T G$, $T = (x^0, \ldots, x^{n-1})$,
and $\omega_n$ a primitive nth root of unity. Then
$\phi^A = \bigoplus_{i=0}^{n-1} \rho_i$, where $\rho_i : x \mapsto \omega_n^i$
and $A = \mathrm{DFT}_n = \frac{1}{\sqrt{n}} [\omega_n^{ij} \mid i, j = 0 \ldots n - 1]$
is the (unitary) discrete Fourier transform well-known from signal processing.
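The cyclic example can be verified numerically. The sketch below builds the regular representation of the generator $x$ from the induction formula and checks $\phi(x) \cdot A = A \cdot \mathrm{diag}(\omega_n^0, \ldots, \omega_n^{n-1})$, which is equivalent to $\phi^A = \bigoplus \rho_i$. The choice $\omega_n = e^{2\pi i/n}$ and all function names are assumptions made for illustration.

```python
import cmath

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def dft(n):
    """Unitary DFT_n = n^(-1/2) [w^(i*j)], with w = exp(2*pi*i/n) as the primitive root."""
    w = cmath.exp(2j * cmath.pi / n)
    return [[w ** (i * j) / n ** 0.5 for j in range(n)] for i in range(n)]

def regular_rep_of_generator(n):
    """phi(x) for T = (x^0, ..., x^(n-1)): entry (i, j) is 1 iff x^i * x * x^(-j) = 1."""
    return [[1 if (i + 1 - j) % n == 0 else 0 for j in range(n)] for i in range(n)]

def is_diagonalized(n, tol=1e-9):
    A = dft(n)
    w = cmath.exp(2j * cmath.pi / n)
    rho = [[w ** i if i == j else 0 for j in range(n)] for i in range(n)]
    left = matmul(regular_rep_of_generator(n), A)   # phi(x) * A
    right = matmul(A, rho)                          # A * (rho_0 + ... + rho_{n-1})(x)
    return all(abs(left[r][c] - right[r][c]) < tol for r in range(n) for c in range(n))

print(is_diagonalized(8))
```

Comparing $\phi(x) A$ with $A \rho(x)$ avoids computing $A^{-1}$ explicitly.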

If $A$ is a Fourier transform for the group $G$, then we call any fast algorithm
for the multiplication with $A$ a fast Fourier transform. Of course, the
term "fast" depends on the chosen complexity model. Since we are primarily
interested in the realization of a fast Fourier transform on a quantum computer
(QFT), we first have to define the measure of complexity on this architecture.

3 The Complexity Model Used in Quantum Computing

Quantum computing is a topic of recent interest which emerged after the
discovery of polynomial algorithms for integer factoring and discrete logarithms by
P. Shor (see [27]). The state of a quantum computer is given by a normalized
vector in a Hilbert space $\mathcal{H}_n$ of dimension $2^n$, which is given the
natural tensor structure
$\mathcal{H}_n = \mathbb{C}^2 \otimes \ldots \otimes \mathbb{C}^2$ ($n$ factors).
The restriction of the computational space to Hilbert spaces of this particular
form is motivated by the idea of a quantum register consisting of $n$ quantum bits.
A quantum bit, also called qubit, is a state corresponding to one tensor component
of $\mathcal{H}_n$ and has the form

\[ |\varphi\rangle = \alpha |0\rangle + \beta |1\rangle, \quad
   |\alpha|^2 + |\beta|^2 = 1, \quad \alpha, \beta \in \mathbb{C}. \]

The possible operations this computer can carry out are the elements of the
unitary group $\mathcal{U}(2^n)$. To study the complexity of performing unitary
operations on n-qubit quantum systems we introduce the following two types of
computational primitives (see also [15], this volume): Local unitary operations on
a qubit $i$ are matrices of the form
$U^{(i)} := 1_{2^{i-1}} \otimes U \otimes 1_{2^{n-i}}$ where $U$ is an element of the
unitary group $\mathcal{U}(2)$ of $2 \times 2$-matrices. Furthermore we need
operations which affect two qubits at a time, the standard choice for which is the
so-called controlled NOT gate (also called measurement gate) between the qubits
$i$ (control) and $j$ (target) defined by

\[ \mathrm{CNOT}^{(i,j)} := \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\
   0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \]

when restricted to the tensor component of the Hilbert space spanned by the
qubits $i$ and $j$. We assume that these so-called elementary quantum gates can
be performed with cost $O(1)$.
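To make the two primitives concrete, here is a small sketch that builds $U^{(i)}$ as a Kronecker product and applies the CNOT matrix to basis states. It assumes that qubit 1 is the most significant tensor factor and that basis states are indexed $0, \ldots, 2^n - 1$; the helper names are illustrative.

```python
def kron(X, Y):
    """Kronecker product; the first factor is the most significant one."""
    return [[a * b for a in rx for b in ry] for rx in X for ry in Y]

def identity(k):
    return [[1 if i == j else 0 for j in range(k)] for i in range(k)]

def local_gate(U, i, n):
    """U^(i) = 1_{2^(i-1)} (x) U (x) 1_{2^(n-i)} for qubits numbered 1..n."""
    return kron(kron(identity(2 ** (i - 1)), U), identity(2 ** (n - i)))

def apply(M, v):
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def basis(k, dim):
    return [1 if idx == k else 0 for idx in range(dim)]

NOT = [[0, 1], [1, 0]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

# NOT on qubit 2 of a 3-qubit register sends |000> to |010>.
print(apply(local_gate(NOT, 2, 3), basis(0, 8)))
# CNOT flips the target only if the control is set: |10> goes to |11>.
print(apply(CNOT, basis(2, 4)))
```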

In the graphical notation using quantum wires (for details see [1]) these 
transforms are written as shown in Figure 1. The lines correspond to the qubits, 
unaffected qubits are omitted, and the dot • sitting on a wire denotes the control 
bit. 



Fig. 1. Elementary quantum gates.

These two types of gates suffice to generate all unitary transformations, which 
is the content of the following theorem from [1]. 

Theorem 1. The set
$\mathcal{G} = \{U^{(i)}, \mathrm{CNOT}^{(i,j)} \mid U \in \mathcal{U}(2),\ i, j = 1 \ldots n,\ i \ne j\}$
is a generating set for the unitary group $\mathcal{U}(2^n)$.

This means that for each $U \in \mathcal{U}(2^n)$ there is a word
$w_1 w_2 \cdots w_k$ (where $w_i \in \mathcal{G}$ for $i = 1 \ldots k$ is an
elementary gate) such that $U$ factorizes as $U = w_1 w_2 \cdots w_k$.
In general only exponential upper bounds for the minimal length occurring in
factorizations have been proved (see [1]) but there are many interesting classes
of unitary matrices in $\mathcal{U}(2^n)$ affording only polylogarithmic word
length, which means that the minimal length is asymptotically $O(p(n))$ where $p$
is a polynomial. In the following, we give examples of some particular unitary
transforms admitting short factorizations which will be useful in the rest of the
paper.

— The symmetric group $S_n$ is embedded in $\mathcal{U}(2^n)$ by the natural
  operation of $S_n$ on the tensor components (qubits). Let $\tau \in S_n$ and
  $\Pi_\tau$ the corresponding permutation matrix on $2^n$ points. Then $\Pi_\tau$
  has a $O(n)$ factorization as shown in [22]. As an example in Figure 2 the
  permutation (1,3,2) of the qubits (which corresponds to the permutation
  (2,5,3)(4,6,7) on the register) is factored according to (1,3,2) = (1,2)(2,3).




Fig. 2. Factorization (1,3,2) = (1,2)(2,3).

— Following the notation of [1] we denote a k-times controlled $U$ by
  $\Lambda_k(U)$. As an example for the graphical notation we give in Figure 3 a
  $\Lambda_1(U)$ gate for arbitrary $U \in \mathcal{U}(2^n)$ with normal, and a gate
  with inverted control bit including the represented matrix. Lemma 7.2 and
  Lemma 7.5 in [1] show that for $U \in \mathcal{U}(2)$ the gate $\Lambda_k(U)$ can
  be realized with gate complexity $O(n)$, for $k < n - 1$, and
  $\Lambda_{n-1}(U)$ in $O(n^2)$.







Fig. 3. Controlled gates with a) normal and b) inverted control bit. The
represented matrices are a) $\Lambda_1(U) = 1_{2^n} \oplus U$ and
b) $U \oplus 1_{2^n}$.

— The Fourier transform $\mathrm{DFT}_{2^n}$ can be performed in $O(n^2)$
  elementary operations on a quantum computer (see [27], [8]).

— Let $P_n \in S_{2^n}$ be the cyclic shift acting on the states of the quantum
  register as $x \mapsto x + 1 \bmod 2^n$. The corresponding permutation matrix is
  the $2^n$-cycle $(1, 2, \ldots, 2^n)$. $P_n$ can be realized using $O(n^2)$ basic
  operations (see [13], Section 4.4).
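The qubit permutation example from the first item can be reproduced in a few lines. The sketch assumes one particular convention, namely that qubit 1 is the most significant bit, that register points are numbered $1, \ldots, 2^n$, and that the new qubit $i$ receives the content of the old qubit $\tau(i)$. Under this convention the permutation (1,3,2) of three qubits yields exactly the cycles quoted in the text. The function names are hypothetical.

```python
def register_permutation(tau, n):
    """Permutation of the points 1..2^n induced by a qubit permutation tau (dict, 1-based)."""
    mapping = {}
    for state in range(2 ** n):
        bits = [(state >> (n - q)) & 1 for q in range(1, n + 1)]   # bits[q-1] is qubit q
        new_bits = [bits[tau[q] - 1] for q in range(1, n + 1)]     # new qubit q = old qubit tau(q)
        new_state = 0
        for b in new_bits:
            new_state = (new_state << 1) | b
        mapping[state + 1] = new_state + 1
    return mapping

def cycles(mapping):
    """Nontrivial cycles of a permutation given as a dict, each starting at its smallest point."""
    seen, result = set(), []
    for start in sorted(mapping):
        if start in seen:
            continue
        cycle, nxt = [start], mapping[start]
        seen.add(start)
        while nxt != start:
            cycle.append(nxt)
            seen.add(nxt)
            nxt = mapping[nxt]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result

tau = {1: 3, 3: 2, 2: 1}          # the cycle (1,3,2) on the qubits
print(cycles(register_permutation(tau, 3)))
```

With the opposite convention (old qubit $i$ moves to position $\tau(i)$) the inverse permutation of the register is obtained.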

Let $U \in \mathcal{U}(2^n)$. The cost for a (single) controlled $U$ is settled by
the following

Lemma 1. If $U \in \mathcal{U}(2^n)$ can be realized in $O(p(n))$ elementary
operations then $\Lambda_1(U) \in \mathcal{U}(2^{n+1})$ can also be realized in
$O(p(n))$ basic operations.

Proof: First we assume without loss of generality that $U$ is written in elementary
gates. Therefore we have to show that a double controlled NOT and a single
controlled $U \in \mathcal{U}(2)$ can be realized with a constant increase of
length. This follows from [1]. □

4 Creating Fast Fourier Transforms

In Section 2 we have explained that a Fourier transform for a group $G$ is a
decomposition matrix $B$ for a regular representation $\phi$ of $G$ with the
additional property that equivalent irreducible summands are equal, i.e.

\[ \phi^B = \rho_1 \oplus \ldots \oplus \rho_k \quad \text{fulfilling} \quad
   \rho_i \cong \rho_j \Rightarrow \rho_i = \rho_j. \]

A "fast" Fourier transform (on a classical computer) is given by a factorization
of $B$ into a product of sparse matrices (see [3], [6], [19], [25]). For a solvable
group $G$, this factorization can be obtained recursively using the following idea.
First, a normal subgroup $N$ of prime index $(G : N) = p$ is chosen. Using
transitivity of induction, $\phi = 1_E \uparrow G$ is written as
$(1_E \uparrow N) \uparrow G$ (note that we have the freedom to choose the
transversals appropriately). Then $1_E \uparrow N$, which again is a regular
representation, is decomposed (by recursion) yielding a Fourier transform $A$ for
$N$. In the last step, $B$ is derived from $A$ using a recursion formula.

In the following, we will explain this procedure in more detail by first pre- 
senting the two essential theorems (without proof) and then stating the actual 
algorithm for deriving fast Fourier transforms for solvable groups. The special 
tensor structure of the recursion formula mentioned above will allow us to use 
the very same algorithm as a starting point to also obtain fast quantum Fourier 
transforms in the case G being a 2-group (i.e. |G| is a power of 2). 

The statements in this section are all taken from the first chapter of [24] 
where decomposition matrices for arbitrary monomial representations in general 
are investigated. The first thing we need is Clifford’s Theorem which explains 
the relationship between the irreducible representations of N and those of G. 

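Theorem 2 (Clifford). Let $N \trianglelefteq G$ be a normal subgroup of prime index
$p$ with (cyclic) transversal $T = (t^0, t^1, \ldots, t^{p-1})$, and denote by
$\lambda_i : t \mapsto \omega_p^i$, $i = 0 \ldots p - 1$, the $p$ irreducible
representations of $G$ arising from $G/N$. Assume $\rho$ is an irreducible
representation of $N$. Then exactly one of the two following cases applies:

1. $\rho \cong \rho^t$ and $\rho$ has $p$ pairwise inequivalent extensions to $G$.
   If $\overline{\rho}$ is one of them, then all are given by
   $\lambda_i \cdot \overline{\rho}$, $i = 0 \ldots p - 1$.

2. $\rho \not\cong \rho^t$ and $\rho \uparrow_T G$ is irreducible. Furthermore,
   $(\rho \uparrow_T G) \downarrow N = \bigoplus_{i=0}^{p-1} \rho^{t^i}$ and

   \[ (\lambda_i \cdot (\rho \uparrow_T G))^{D^i \otimes 1_{\deg(\rho)}}
      = \rho \uparrow_T G, \quad
      D = \mathrm{diag}(1, \omega_p, \ldots, \omega_p^{p-1}). \]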

The following theorem provides the recursion formula which had already been 
used in [3] to obtain fast Fourier transforms. 

Theorem 3. Let $N \trianglelefteq G$ be a normal subgroup of prime index $p$ with
transversal $T = (t^0, t^1, \ldots, t^{p-1})$ and $\phi$ a representation of degree
$d$ of $N$. Suppose that $A$ is a matrix decomposing $\phi$ into irreducibles, i.e.
$\phi^A = \rho = \rho_1 \oplus \ldots \oplus \rho_k$, and that $\overline{\rho}$ is
an extension of $\rho$ to $G$. Then

\[ (\phi \uparrow_T G)^B = \bigoplus_{i=0}^{p-1} \lambda_i \cdot \overline{\rho}, \]

where $\lambda_i : t \mapsto \omega_p^i$, $i = 0 \ldots p - 1$, are the $p$
irreducible representations of $G$ arising from the factor group $G/N$,

\[ B = (1_p \otimes A) \cdot D \cdot (\mathrm{DFT}_p \otimes 1_d), \quad \text{and} \quad
   D = \bigoplus_{i=0}^{p-1} \overline{\rho}(t)^i. \]

If, in particular, $\overline{\rho}$ is a direct sum of irreducibles, then $B$ is a
decomposition matrix of $\phi \uparrow_T G$.

In case of a cyclic group $G$ the formula yields exactly the well-known
Cooley-Tukey decomposition [7], in which $D$ is usually called the Twiddle matrix.
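The smallest cyclic instance of the recursion formula can be checked by hand or by machine. The sketch below takes $G = Z_4 = \langle x \rangle$, $N = \langle x^2 \rangle$, $T = (1, x)$, $A = \mathrm{DFT}_2$ and the extension $\overline{\rho}(x) = \mathrm{diag}(1, i)$; these concrete choices, the root $\omega_4 = i$, and all helper names are assumptions made for illustration. It verifies that $B = (1_2 \otimes A) \cdot D \cdot (\mathrm{DFT}_2 \otimes 1_2)$ satisfies $(\phi \uparrow_T G)(x) \cdot B = B \cdot \mathrm{diag}(1, i, -1, -i)$.

```python
import cmath

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def kron(X, Y):
    return [[a * b for a in rx for b in ry] for rx in X for ry in Y]

def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def dft(n):
    w = cmath.exp(2j * cmath.pi / n)
    return [[w ** (i * j) / n ** 0.5 for j in range(n)] for i in range(n)]

def close(X, Y, tol=1e-9):
    return all(abs(a - b) < tol for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

A = dft(2)                        # Fourier transform for N = <x^2>
D = diag([1, 1, 1, 1j])           # rho_bar(x)^0 (+) rho_bar(x)^1 with rho_bar(x) = diag(1, i)
B = matmul(matmul(kron(diag([1, 1]), A), D), kron(dft(2), diag([1, 1])))

# (phi induced to G)(x) in 2x2 blocks: [[0, 1_2], [phi(x^2), 0]] with phi(x^2) the swap matrix.
induced_x = [[0, 0, 1, 0], [0, 0, 0, 1], [0, 1, 0, 0], [1, 0, 0, 0]]
# lambda_0 * rho_bar (+) lambda_1 * rho_bar evaluated at x.
decomposed_x = diag([1, 1j, -1, -1j])

print(close(matmul(induced_x, B), matmul(B, decomposed_x)))
```

Up to the factor 1/2, the rows of $B$ are the rows of the $4 \times 4$ DFT matrix in bit-reversed order, which is the familiar shape of a radix-2 Cooley-Tukey step.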

Assume that $N \trianglelefteq G$ is a normal subgroup of prime index $p$ with
Fourier transform $A$ and decomposition $\phi^A = \rho = \bigoplus_i \rho_i$. We can
reorder the $\rho_i$ such that the first, say $k$, $\rho_i$ have an extension
$\overline{\rho}_i$ to $G$ and the other $\rho_i$ occur as sequences
$\rho_i \oplus \rho_i^t \oplus \ldots \oplus \rho_i^{t^{p-1}}$ of inner conjugates
(cf. Theorem 2, note that irreducibles $\rho_i$, $\rho_i^t$ have the same
multiplicity since $\phi$ is regular). In the first case the extension may be
calculated by Minkwitz' formula [21], in the latter case each sequence can be
extended by $\rho_i \uparrow_T G$ (Theorem 2, Case 2). We do not state Minkwitz'
formula here, since we will not need it in the special cases treated in Section 5.
Altogether we obtain an extension $\overline{\rho}$ of $\rho$ and can apply
Theorem 3. The remaining task is to assure that equivalent irreducibles in
$\bigoplus_i \lambda_i \cdot \overline{\rho}$ are equal. For summands of
$\overline{\rho}$ of the form $\overline{\rho}_i$ we have that
$\lambda_j \cdot \overline{\rho}_i$ and $\overline{\rho}_i$ are inequivalent and
hence there is nothing to do. For summands of $\overline{\rho}$ of the form
$\rho_i \uparrow_T G$, we conjugate $\lambda_j \cdot (\rho_i \uparrow_T G)$ onto
$\rho_i \uparrow_T G$ using Theorem 2, Case 2.

Now we are ready to formulate the recursive algorithm for constructing a 
fast Fourier transform for a solvable group G. 

Algorithm 1. Let $N \trianglelefteq G$ be a normal subgroup of prime index $p$ with
transversal $T = (t^0, t^1, \ldots, t^{p-1})$. Suppose that $\phi$ is a regular
representation of $N$ with (fast) Fourier transform $A$, i.e.
$\phi^A = \rho_1 \oplus \ldots \oplus \rho_k$, fulfilling
$\rho_i \cong \rho_j \Rightarrow \rho_i = \rho_j$. A Fourier transform $B$ of $G$
with respect to the regular representation $\phi \uparrow_T G$ can be obtained as
follows.

1. Determine a permutation matrix $P$ rearranging the $\rho_i$, $i = 1 \ldots k$,
   such that the extensible $\rho_i$ (i.e. those satisfying $\rho_i = \rho_i^t$)
   come first, followed by the others ordered into sequences of length $p$
   equivalent to $\rho_i, \rho_i^t, \ldots, \rho_i^{t^{p-1}}$. (Note: These
   sequences need to be equal to $\rho_i, \rho_i^t, \ldots, \rho_i^{t^{p-1}}$,
   which is established in the next step.)

2. Calculate a matrix $M$ which is the identity on the extensibles and conjugates
   the sequences of length $p$ to make them equal to
   $\rho_i, \rho_i^t, \ldots, \rho_i^{t^{p-1}}$.

3. Note that $A \cdot P \cdot M$ is a decomposition matrix for $\phi$, too, and let
   $\rho = \phi^{A \cdot P \cdot M}$. Extend $\rho$ to $G$ summandwise. For the
   extensible summands use Minkwitz' formula; the sequences
   $\rho_i, \rho_i^t, \ldots, \rho_i^{t^{p-1}}$ can be extended by
   $\rho_i \uparrow_T G$.

4. Evaluate $\overline{\rho}$ at $t$ and build
   $D = \bigoplus_{i=0}^{p-1} \overline{\rho}(t)^i$.

5. Construct a block diagonal matrix $C$ with Theorem 2, Case 2, conjugating
   $\bigoplus_{i=0}^{p-1} \lambda_i \cdot \overline{\rho}$ such that equivalent
   irreducibles are equal. $C$ is the identity on the extended summands.

Result:

\[ B = (1_p \otimes A \cdot P \cdot M) \cdot D \cdot (\mathrm{DFT}_p \otimes 1_{|N|}) \cdot C \qquad (1) \]

is a fast Fourier transform for $G$. □

It is obviously possible to construct fast Fourier transforms on a classical com- 
puter for any solvable group by recursive use of this algorithm. 

Since we restrict ourselves to the case of a quantum computer consisting of
qubits, i.e. two-level systems, we apply Algorithm 1 to obtain QFTs for 2-groups
($|G| = 2^n$, $p = 2$). In this case the two tensor products occurring in (1) fit
very well to yield a coarse factorization as shown in Figure 4. The lines
correspond to the qubits as in Section 3 and a box ranging over more than one line
denotes a matrix admitting no a priori factorization into a tensor product.

The remaining problem is the realization of the matrices $A$, $P$, $M$, $D$, $C$ in
terms of elementary building blocks as presented in Section 3. At present,
however, this realization remains a creative process which might be performed by
hand if a certain class of groups is given. In Section 5 we will apply Algorithm 1
to a class of non-abelian 2-groups.




Fig. 4. Coarse quantum circuit visualizing Algorithm 1 for 2-groups. 



5 Generating QFTs for a Class of 2-Groups 

In the case of $G$ being an abelian 2-group the realization of a fast quantum
Fourier transform has been settled by [18]. This case is also covered by the
method presented here. In this section we will apply Algorithm 1 to the class
of non-abelian 2-groups containing a cyclic normal subgroup of index 2. Fast
quantum Fourier transforms for these groups have already been constructed by
Høyer in [16].

According to [17], p. 90/91, there are for $n \ge 3$ exactly four isomorphism
types of non-abelian groups of order $2^{n+1}$ affording a cyclic normal subgroup
of order $2^n$:

1. The dihedral group
   $D_{2^{n+1}} = \langle x, y \mid x^{2^n} = y^2 = 1,\ x^y = x^{-1} \rangle$.

2. The quaternion group
   $Q_{2^{n+1}} = \langle x, y \mid x^{2^n} = y^4 = 1,\ y^2 = x^{2^{n-1}},\ x^y = x^{-1} \rangle$.

3. The group
   $QP_{2^{n+1}} = \langle x, y \mid x^{2^n} = y^2 = 1,\ x^y = x^{2^{n-1}+1} \rangle$.

4. The quasi-dihedral group
   $QD_{2^{n+1}} = \langle x, y \mid x^{2^n} = y^2 = 1,\ x^y = x^{2^{n-1}-1} \rangle$.

Observe that the extensions 1, 3, and 4 of the cyclic subgroup
$Z_{2^n} = \langle x \rangle$ split, i.e. the groups have the structure of a
semidirect product of $Z_{2^n}$ by $Z_2$. The three isomorphism types correspond
to the three different embeddings of $Z_2 = \langle y \rangle$ into
$(Z_{2^n})^\times \cong Z_2 \times Z_{2^{n-2}}$.
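The three embeddings correspond to the three elements of order two in the unit group of $Z_{2^n}$, i.e. the exponents $u$ with $x^y = x^u$. A short brute-force sketch (function names are illustrative) confirms that for $n \ge 3$ these are exactly $-1$, $2^{n-1}+1$ and $2^{n-1}-1$ modulo $2^n$:

```python
def involutions(n):
    """Units u != 1 of Z_{2^n} with u * u = 1 (mod 2^n)."""
    m = 2 ** n
    return sorted(u for u in range(2, m) if (u * u) % m == 1)

def exponents_from_presentations(n):
    """Conjugation exponents of the dihedral, QP and quasi-dihedral presentations."""
    m = 2 ** n
    return sorted([m - 1, m // 2 + 1, m // 2 - 1])

for n in range(3, 9):
    print(n, involutions(n), exponents_from_presentations(n))
```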

5.1 QFT for the Dihedral Groups $D_{2^{n+1}}$

In this section we construct a QFT for the dihedral groups $D_{2^{n+1}}$ step by
step according to Algorithm 1 and explicitly state the occurring quantum circuits.

Let $G = D_{2^{n+1}} = \langle x, y \mid x^{2^n} = y^2 = 1,\ x^y = x^{-1} \rangle$
with normal subgroup $N = \langle x \rangle \trianglelefteq G$ of index 2 and
transversal $T = (1, y)$. We consider the regular representation
$\phi = (1_E \uparrow_S N) \uparrow_T G$ of $G$ with
$S = (1, x, \ldots, x^{2^n-1})$. Obviously the regular representation
$(1_E \uparrow_S N)$ of $N$ is decomposed by $A = \mathrm{DFT}_{2^n}$ into
$\rho_0 \oplus \ldots \oplus \rho_{2^n-1}$ where
$\rho_i : x \mapsto \omega_{2^n}^i$. Now we are ready to apply Algorithm 1 to
obtain a decomposition matrix $B$ for $\phi$. For convenience we denote
$\omega_{2^n}$ simply as $\omega$ and note that
$\mathrm{DFT}_2 = \frac{1}{\sqrt{2}} \cdot \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$.

1. Since $\rho_i^y(x) = \rho_i(y x y^{-1}) = \rho_i(x^{-1}) = \rho_{2^n-i}(x)$ we see
   that there are exactly two extensible $\rho_i$, namely for $i = 0, 2^{n-1}$. The
   sequences of inner conjugates are given by $\rho_i, \rho_{2^n-i}$,
   $i \ne 0, 2^{n-1}$. Thus we need a permutation $P$ reordering the $\rho_i$ as

   \[ \underbrace{\rho_0, \rho_{2^{n-1}}}_{\text{extensibles}};\
      \underbrace{\rho_1, \rho_{2^n-1}, \ldots, \rho_i, \rho_{2^n-i}, \ldots,
      \rho_{2^{n-1}-1}, \rho_{2^{n-1}+1}}_{\text{pairs of inner conjugates}}. \]

   This can be accomplished by the circuit given in Figure 5 since the n-cycle
   on the qubits which is performed first yields a decimation by two on the
   indices, i.e. the indices $0, \ldots, 2^{n-1} - 1$ have found their correct
   position. The only thing which remains to do is to perform the operation
   $x \mapsto -x$ on the odd positions. This can be done by an inversion of all
   (odd) bits followed by an $x \mapsto x + 1$ shift $P_{n-1}$ on the odd states of
   the register.




Fig. 5. Ordering the irreducibles of $Z_{2^n} \trianglelefteq D_{2^{n+1}}$.

2. $M$ can be omitted since all the $\rho_i$ are of degree 1.

3. Let $(1_E \uparrow_S N)^{A \cdot P} = \rho$. We extend $\rho$ summandwise to
   $\overline{\rho}$:

   — $\rho_0 = 1_N$ can be extended by $1_G$.

   — $\rho_{2^{n-1}}$ can be extended through $\overline{\rho}_{2^{n-1}}(y) = 1$.

   — The sequences $\rho_i \oplus \rho_{2^n-i}$, $i \ne 0, 2^{n-1}$, can be extended
     by $\rho_i \uparrow_T G$ and
     $(\rho_i \uparrow_T G)(y) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.

4. Evaluation of $\overline{\rho}$ at the transversal $T$ yields the Twiddle matrix

   \[ D = \overline{\rho}(1) \oplus \overline{\rho}(y) = 1_{2^n} \oplus 1_2 \oplus
      \underbrace{\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \oplus \ldots \oplus
      \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}}_{2^{n-1}-1}, \]

   which is realized by the quantum circuit given in Figure 6.

Fig. 6. Twiddle matrix for $D_{2^{n+1}}$. Fig. 7. Equalizing inductions.

5. According to Theorem 2, Case 2, the matrix $C$ has the following diagonal
   form:

   \[ C = 1_{2^n} \oplus \mathrm{diag}(1, 1,
      \underbrace{1, -1, \ldots, 1, -1}_{2^{n-1}-1 \text{ pairs}}), \]

   which is realized by the quantum circuit given in Figure 7.



We obtain that
$B = (1_p \otimes A \cdot P \cdot M) \cdot D \cdot (\mathrm{DFT}_p \otimes 1_{|N|}) \cdot C$
is a decomposition matrix for $\phi$ and a fast quantum Fourier transform for $G$.
The whole circuit is shown in Figure 8.

Fig. 8. Complete QFT circuit for the dihedral group $D_{2^{n+1}}$.
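The five steps above can be replayed numerically for small $n$. The following sketch is not the authors' GAP/AREP implementation; it is a plain matrix check under stated assumptions: $\omega = e^{2\pi i/2^n}$, the permutation matrix $P$ acts on columns (column $k$ of $A \cdot P$ is column `order[k]` of $A$), and the regular representation is written in $2 \times 2$ blocks with respect to $T = (1, y)$. It confirms that $B$ intertwines the regular representation with a block diagonal representation in which equivalent irreducibles are equal.

```python
import cmath

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def kron(X, Y):
    return [[a * b for a in rx for b in ry] for rx in X for ry in Y]

def direct_sum(*blocks):
    size = sum(len(b) for b in blocks)
    out = [[0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out

def identity(k):
    return [[1 if i == j else 0 for j in range(k)] for i in range(k)]

def dft(k):
    w = cmath.exp(2j * cmath.pi / k)
    return [[w ** (i * j) / k ** 0.5 for j in range(k)] for i in range(k)]

def shift(m, e):
    """Regular representation of x^e in Z_m: entry (i, j) is 1 iff j = i + e (mod m)."""
    return [[1 if (i + e - j) % m == 0 else 0 for j in range(m)] for i in range(m)]

def close(X, Y, tol=1e-9):
    return all(abs(a - b) < tol for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

SWAP = [[0, 1], [1, 0]]      # (rho_i induced to G)(y) in the dihedral case
SIGN = [[1, 0], [0, -1]]     # block of C for one pair of inner conjugates

def dihedral_qft(n):
    m, pairs = 2 ** n, 2 ** (n - 1) - 1
    # Step 1: rho_0, rho_{m/2}, then the pairs (rho_i, rho_{m-i}).
    order = [0, m // 2] + [k for i in range(1, m // 2) for k in (i, m - i)]
    P = [[1 if order[col] == row else 0 for col in range(m)] for row in range(m)]
    AP = matmul(dft(m), P)                                  # Step 2: M is the identity
    rho_bar_y = direct_sum(identity(2), *[SWAP] * pairs)    # Step 3
    D = direct_sum(identity(m), rho_bar_y)                  # Step 4
    C = direct_sum(identity(m), identity(2), *[SIGN] * pairs)   # Step 5
    B = matmul(matmul(matmul(kron(identity(2), AP), D), kron(dft(2), identity(m))), C)
    return B, order, rho_bar_y

def check(n):
    m = 2 ** n
    w = cmath.exp(2j * cmath.pi / m)
    B, order, rho_bar_y = dihedral_qft(n)
    rho_x = [[w ** order[i] if i == j else 0 for j in range(m)] for i in range(m)]
    reg_x = direct_sum(shift(m, 1), shift(m, -1))   # regular representation of x
    reg_y = kron(SWAP, identity(m))                 # regular representation of y
    dec_x = direct_sum(rho_x, rho_x)
    dec_y = direct_sum(rho_bar_y, [[-1, 0], [0, -1]], *[SWAP] * (m // 2 - 1))
    return (close(matmul(reg_x, B), matmul(B, dec_x))
            and close(matmul(reg_y, B), matmul(B, dec_y)))

print(check(2), check(3))
```

In the decomposed form every pair of inner conjugates contributes the same $2 \times 2$ block twice, which is the equality of equivalent irreducibles that the matrix $C$ is meant to enforce.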



5.2 QFT for the Groups $Q_{2^{n+1}}$, $QP_{2^{n+1}}$, and $QD_{2^{n+1}}$

In the following we give the circuits for the groups $Q_{2^{n+1}}$,
$QP_{2^{n+1}}$, and $QD_{2^{n+1}}$. In all cases we have
$\langle x \rangle = N \trianglelefteq G$ so that Algorithm 1 has to be performed
only once for the last step. For the sake of brevity we will state only those parts
of the circuit which differ from the dihedral group. We use the same notation as in
the last section.

$Q_{2^{n+1}}$: The irreducibles $\rho_i$ extend or induce in the same way as in the
dihedral case. Hence the QFT only differs in the Twiddle matrix $D$ since for a
non-extensible $\rho_i$ we have
$(\rho_i \uparrow_T G)(y) = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. Thus $D$
is given by

\[ D = 1_{2^n} \oplus 1_2 \oplus
   \underbrace{\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \oplus \ldots \oplus
   \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}}_{2^{n-1}-1} \]

and can be realized by the circuit in Figure 9.




Fig. 9. Twiddle matrix for $Q_{2^{n+1}}$. Fig. 10. Permutation for $QP_{2^{n+1}}$.

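$QP_{2^{n+1}}$: To determine which $\rho_i$ are extensible we observe
$\rho_i^y(x) = \rho_i(y x y^{-1}) = \rho_i(x^{2^{n-1}+1})$. Hence
$\rho_i = \rho_i^y \Leftrightarrow \omega^{i \cdot 2^{n-1}} = 1 \Leftrightarrow 2 \mid i$
and there are exactly $2^{n-1}$ extensible $\rho_i$. The reordering permutation $P$
has the easy form shown in Figure 10, and the matrix $D$ is given by

\[ D = 1_{2^n} \oplus 1_{2^{n-1}} \oplus
   \underbrace{\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \oplus \ldots \oplus
   \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}}_{2^{n-2}}, \]

which is simply a double controlled NOT as visualized in Figure 11. The matrix
$C$ then is given by Figure 12.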





Fig. 11. Twiddle matrix for $QP_{2^{n+1}}$. Fig. 12. Equalizing for $QP_{2^{n+1}}$.

$QD_{2^{n+1}}$: Here we have $\rho_i^y(x) = \rho_i(x^{2^{n-1}-1})$ and

\[ \rho_i = \rho_i^y \Leftrightarrow \omega^{i \cdot (2^{n-1}-2)} = 1
   \Leftrightarrow i = 0, 2^{n-1}. \]

Thus everything is the same as in the dihedral case besides the ordering
permutation $P$ which takes the more complicated form shown in Figure 13.
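The extensibility conditions for the four families reduce to a congruence: if $y$ acts as $x \mapsto x^u$, then $\rho_i = \rho_i^y$ exactly when $i \cdot u \equiv i \pmod{2^n}$. A brute-force sketch (names are illustrative) lists the extensible indices for each family:

```python
def extensible_indices(n, u):
    """Indices i with rho_i = rho_i^y when y acts as x -> x^u on Z_{2^n}."""
    m = 2 ** n
    return [i for i in range(m) if (i * u - i) % m == 0]

def conjugation_exponents(n):
    """Exponent u with x^y = x^u for each of the four families."""
    m = 2 ** n
    return {"D": m - 1, "Q": m - 1, "QP": m // 2 + 1, "QD": m // 2 - 1}

n = 4
for name, u in conjugation_exponents(n).items():
    print(name, extensible_indices(n, u))
```

For $D$, $Q$ and $QD$ only $i = 0$ and $i = 2^{n-1}$ survive, whereas for $QP$ all even $i$ are extensible, in line with the case distinctions in the text.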

Concerning the complexity of these QFTs we have the following theorem. 

Theorem 4. The Fourier transforms for the groups $G = D_{2^n}$, $Q_{2^n}$,
$QP_{2^n}$, and $QD_{2^n}$ can be performed on a quantum computer in
$O(\log^2 |G|)$ elementary operations.

Proof: We can treat the four series uniformly, since the Fourier transforms all
have the same decomposition pattern. First, in all cases a Fourier transform for
the normal subgroup $N$ is performed with cost of $O(n^2)$ basic operations.

The reordering permutation $P$, the Twiddle matrix $D$, and the equalizing matrix
$C$ cost $O(n^2)$ in case of $D_{2^n}$, $Q_{2^n}$, and $QD_{2^n}$ due to Lemma 1 and
the examples in Section 3. For $QP_{2^n}$ we need only $O(1)$ operations for $P$,
$D$, and $C$. □

All presented Fourier transforms have been implemented by the authors in the 
language GAP [14] using the share package AREP [12]. 

6 Conclusions and Outlook 

A constructive algorithm has been presented that makes it possible to attack the
problem of constructing fast Fourier transforms for 2-groups $G$ on a quantum
computer built up from qubits. For a certain class of non-abelian 2-groups the
algorithm has been successfully applied. All the QFTs created are of computational
complexity $O(\log^2 |G|)$, as in the case of the cyclic group $Z_{2^n}$. The main
problem imposed by the implementation of certain permutation and block diagonal
matrices has been solved efficiently.

Using the recursion formula from Theorem 3 it should be possible to construct
QFTs for other classes of groups as well as to realize certain signal transforms
on a quantum computer by means of symmetry-based decomposition (see [24],
[11], [20]).

Abstract. We investigate the role of matrix rings in coding theory. Using
these rings we introduce an embedding technique which enables us
to give new interpretations for some previous results about algebraic
representations of some prominent linear codes.



Introduction 

Codes over rings have been enjoying growing importance during the recent
decade. This is based on the observation by Nechaev [13] and Hammons et al.
[9] that certain non-linear binary codes (Kerdock, Preparata and further codes)
of high quality are linear over $\mathbb{Z}_4$. Since then various papers have been
published, many of them dealing with codes over $\mathbb{Z}_4$ but also involving
other rings (cf. [8,3,2]).

Matrix rings are a particular class of non-commutative alphabets which,
apart from [2], have not yet been involved in coding theory. In what follows
we therefore first describe relations of coding theory over the latter class of
rings to that over their base rings. Subsequently we consider embeddings of
given rings into matrix rings and study their properties. To get prepared for
the last section we consider matrix representations of algebras. Extensions of
Skolem-Noether type theorems based on results by Nechaev [12] show that under
suitable conditions an embedding of a $K$-algebra $R$ into a matrix algebra
$M_m(K)$ is unique up to inner automorphisms of the target algebra, provided $K$
is a commutative Artinian local ring.

In the last section we apply the preparations of the preceding ones and
provide a different view on several known code representations via matrix rings.
In particular our considerations shed new light on the results in Wolfmann [16]
and Pasquier [14] (cf. also Goldberg [7]) and yield similar statements for extended
binary Hamming codes and the Octacode.

In what follows the reader is supposed to be familiar with basic facts of ring
and module theory as well as order theory. We will frequently denote the set of
all linear (left) codes of length $n$ over a ring $K$, i.e. the lattice of all
submodules of ${}_K K^n$, by $L({}_K K^n)$. Hence, our use of the term lattice is
the order theoretical one rather than the one usually referred to in coding theory.

1 Linear Codes under Morita Equivalence

Let $K$ be a ring, $m$ a positive integer, and let $S$ denote the full ring of all
$(m \times m)$-matrices over $K$. We establish a connection between linear codes
over $K$ and those over $S$ referring to what is known as Morita equivalence.

For a module ${}_K M$ let $S$ operate on the abelian group $M^m$ via:

\[ S \times M^m \to M^m, \quad
   \big( (a_{ij})_{ij}, (m_j)_j \big) \mapsto \Big( \sum_j a_{ij} m_j \Big)_i, \]

where the index domain of $(a_{ij})_{ij}$ and $(m_j)_j$ are thought to be understood.
Hence, $S$ operates in the same way on $M^m$ as it operates on $K^m$; the only
difference is that the matrix elements are multiplied with module elements rather
than ring elements.

Lemma 1. For every module ${}_S N$ there exists a module ${}_K M$ such that
${}_S N$ is isomorphic to ${}_S M^m$.

The following proposition clarifies the connection between the submodule lattices
involved. For an element $x$ of a module ${}_K M$ let $x^{(m)}$ denote the m-fold
concatenation of $x$, i.e. the element $(x_1, \ldots, x_m) \in M^m$ with $x_i = x$
for all $i = 1, \ldots, m$.

Proposition 1. 1. The mapping $M \to M^m$ with $x \mapsto x^{(m)}$ is a semilinear
   embedding where the accompanying ring homomorphism $K \to S$ is given
   by the natural embedding of $K$ in $S$ with $\lambda \mapsto \lambda I_m$.

2. Its induced lattice mapping $L({}_K M) \to L({}_S M^m)$ is given by
   $C \mapsto C^m$ and is an isomorphism.

Proof. It is easily verified that the mapping in question is additive and that
$(\lambda x)^{(m)} = (\lambda I_m) x^{(m)}$ for all $\lambda \in K$ and $x \in M$,
which is our first claim. The second claim is a direct consequence. □

We are interested in how the property of being free carries over to the
respective module.

Lemma 2. Let $S$ be the ring of $(m \times m)$-matrices over the ring $K$, let
${}_K M$ and ${}_S N$ be modules with ${}_S N \cong {}_S M^m$. Then ${}_S N$ is free
of rank $n$ if and only if ${}_K M$ is free of rank $nm$.

For the coding theoretical context we now summarize what we have seen in 
a theorem: 

Theorem 1. For a ring K and its ring of all (jn x m)-matrices Mm{K) there 
is a 1 — 1-correspondence between linear codes of length mn over K and linear 
codes of length n over Mm{K). This correspondence is induced by (semilin early) 
embedding the former as their m-fold concatenations into the latter. 
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Remark 1. There is a natural relation between the respective decoding problems: 

1 . Any decoding scheme A for a AT-linear code C can be extended to a scheme 
A* used to decode its Mm(Ar)-linear image C™ under the natural embedding 
established above. It proceeds in m steps at the cost of one application 
of A per step. It corrects any matrix error pattern each full row of which 
isicorrectable by A. 

2. Any decoding scheme A* for an Mm(Ar)-linear code C™ can be reduced 
to a scheme A used to decode its AT-linear preimage C under the natural 
embedding. It proceeds at the cost of one application of A. It corrects any 
error pattern e the m-fold concatenation of which is correctable by A* . 
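The extension in item 1 of Remark 1 can be sketched as follows. This is only an illustration under stated assumptions: a matrix word is stored as a list of $n$ matrices of size $m \times m$, the names `full_rows`, `extend_decoder` and `majority` are hypothetical, and the binary repetition code with majority decoding stands in for an arbitrary scheme $A$.

```python
from typing import Callable, List

Word = List[int]
Matrix = List[List[int]]


def full_rows(y: List[Matrix], m: int) -> List[Word]:
    """Full row i of a matrix word: the i-th rows of its n matrices, concatenated."""
    return [[entry for mat in y for entry in mat[i]] for i in range(m)]


def extend_decoder(decode: Callable[[Word], Word], m: int) -> Callable[[List[Matrix]], List[Matrix]]:
    """Build A* from A by decoding each of the m full rows with A."""
    def decode_star(y: List[Matrix]) -> List[Matrix]:
        n = len(y)
        rows = [decode(row) for row in full_rows(y, m)]  # m applications of A
        # Cut the decoded full rows back into n (m x m)-matrices.
        return [[rows[i][j * m:(j + 1) * m] for i in range(m)] for j in range(n)]
    return decode_star


def majority(word: Word) -> Word:
    """Toy scheme A (an assumption): majority decoding of the binary repetition code."""
    bit = 1 if 2 * sum(word) > len(word) else 0
    return [bit] * len(word)


# m = 2, n = 3: each full row has length nm = 6 and carries one bit error.
received = [[[1, 1], [0, 0]], [[0, 1], [0, 0]], [[1, 1], [0, 1]]]
decoded = extend_decoder(majority, 2)(received)
print(decoded)
```

Each full row is handled independently, which is why the extended scheme corrects exactly those matrix error patterns whose full rows are all correctable by the original scheme.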

The correspondence between codes over rings and those over their matrix
rings is purely algebraic so far. It does not imply any deeper equivalence,
since it does not involve any metric aspect. The following considerations are
devoted to this point. We establish a weight function on $M_m(K)$ for which the
semilinear embedding mentioned in Prop. 1 is an isometry.

Let again $S$ denote the ring of all $(m \times m)$-matrices over the ring $K$, and
let $w_K : K \to \mathbb{R}$ be a weight function on $K$, which shall be thought to be
completed additively on $K^m$. We define a function

$$w_S : S \to \mathbb{R}, \qquad A \mapsto \max_i\, w_K(A_i),$$

where $A_i$ denotes the $i$th row of $A$. This kind of row-sum norm is known to satisfy
the triangle inequality and is strictly positive. We have furthermore $w_S(A) =
w_S(-A)$ for all $A \in S$.

As usual we complete $w_S$ additively to $S^n$ ($n \in \mathbb{N}$). Then our definition is
justified by the following statement.

Proposition 2. Let $w_K$ be a weight function on $K$ and $w_S$ the just defined
weight function on $S$. The semilinear embedding ${}_K K^{nm} \to {}_S S^n$ with $x \mapsto x^{(m)}$
is an isometry of $(K^{nm}, w_K)$ into $(S^n, w_S)$.

Proof. It is easily verified that $w_S(x^{(m)}) = w_K(x)$ for all $x \in K^{nm}$. □
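The isometry can be checked numerically on a small case. The sketch below assumes the Hamming weight as the weight function on $K$ and stores an element of $S^n$ as a list of $n$ square matrices; the function names are illustrative only.

```python
def hamming_weight(vec):
    """Additive completion of the Hamming weight on K to vectors over K."""
    return sum(1 for v in vec if v != 0)


def w_S(matrix):
    """Row-max weight on S = M_m(K): the largest K-weight among the rows."""
    return max(hamming_weight(row) for row in matrix)


def concat_embed(x, m):
    """Map x in K^{nm} to n matrices of size m x m, each with m identical rows."""
    n = len(x) // m
    return [[list(x[j * m:(j + 1) * m]) for _ in range(m)] for j in range(n)]


def weight_on_Sn(word):
    """Additive completion of w_S to S^n."""
    return sum(w_S(mat) for mat in word)


x = [1, 0, 2, 0, 0, 3]          # a word of length nm = 6 over Z_4 with m = 2
image = concat_embed(x, 2)
print(weight_on_Sn(image), hamming_weight(x))
```

Because all rows of each matrix in the image coincide, the maximum over rows equals the weight of the corresponding block of $x$, and summing over the $n$ blocks gives the weight of $x$.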

For the set of all linear codes of length $nm$ over $K$ and those of length $n$
over $S$ we are now able to prove the following:

Theorem 2. Let $K$ be a unital ring, $w_K$ a weight function on $K$ and $S$ the ring
of all $(m \times m)$-matrices over $K$ together with the above-defined weight function
$w_S$. For every natural number $n$ the lattice isomorphism $L({}_K K^{nm}) \to L({}_S S^n)$
induced by the isometry $K^{nm} \to S^n$ preserves minimum distance.

Proof. Let $C$ be a $K$-linear code of length $nm$, and let $C^m$ be its $S$-linear image
under the lattice mapping in question. If $x$ is a word of minimal $K$-weight in
$C$, then by Prop. 2 we have $w_S(x^{(m)}) = w_K(x)$ and therefore $d_S(C^m) \le d_K(C)$.
Conversely, if $y$ is a word of minimal $S$-weight in $C^m$, then each full row $y_i$
($i = 1, \ldots, m$) of this matrix word is contained in $C$ and, by definition
of $w_S$, clearly satisfies $w_K(y_i) \le w_S(y)$. This shows that $d_K(C) \le d_S(C^m)$, and
therefore our claim follows. □

We are now able to refine our statements in Rem. 1. 

Theorem 3. Let $K$ be a unital ring, $w_K$ a weight function on $K$ and $S$ the ring
of all $(m \times m)$-matrices over $K$ together with the row-sum weight function $w_S$.
Let furthermore $A$ be a decoding scheme for a $K$-linear code $C$ of length $nm$ and
$A^*$ its extension for the $S$-linear code $C^m$. Then $A$ is bounded distance (with
respect to $w_K$) if and only if $A^*$ is bounded distance (with respect to $w_S$).

Proof. Let first $A$ correct error patterns of $K$-weight up to $t$, and let $y$ be an
$S$-error pattern of $S$-weight $s \le t$. As we saw in the proof of Thm. 2 the $K$-weight
of each full row of $y$ is at most $s$, and hence it can be corrected by $A$. But this
shows that $A^*$ corrects $S$-error patterns of $S$-weight up to $t$. Conversely, if $A^*$
corrects error patterns of $S$-weight up to $t$, and $x$ is a $K$-error pattern of
$K$-weight $s \le t$, then by Prop. 2, i.e. $w_S(x^{(m)}) = w_K(x)$, it is immediate
that $x^{(m)}$ can be corrected by $A^*$, and therefore $x$ can be corrected by $A$. □

2 Code Correspondences Induced by Ring Embeddings 

The situation of the foregoing section changes significantly when considering 
more general embeddings. 

Let $\varphi : R \to S$ be a unital ring embedding and let $n$ be a natural number.
The given embedding clearly induces a semilinear embedding $\varphi : {}_R R^n \to {}_S S^n$
defined componentwise, and the latter gives rise to a lattice mapping
$\bar\varphi : L({}_R R^n) \to L({}_S S^n)$ with $C \mapsto S\varphi(C)$, the $S$-submodule generated by $\varphi(C)$.

This mapping need not be injective, as the natural embedding of $\mathbb{Z}$ into $\mathbb{Q}$
shows. It is immediate, however, that $\bar\varphi$ is (completely) join-preserving, i.e. for
any family $(C_i)_{i \in I}$ of submodules of ${}_R R^n$ there holds
$\bar\varphi(\sum_{i \in I} C_i) = \sum_{i \in I} \bar\varphi(C_i)$.

Under appropriate assumptions on the relation between $R$ and $S$ a corresponding
statement for finite meets can be proved. Recall that $S$ obtains the structure of
an $R$-bimodule by setting $r \cdot s := \varphi(r)s$ and $s \cdot r := s\varphi(r)$, and it is clear that
$(r \cdot s) \cdot r' = r \cdot (s \cdot r')$ for all $r, r' \in R$ and $s \in S$. Keeping this in mind we find it
convenient to write $\bar\varphi(C) = S \otimes_R C$.

Lemma 3. If $S_R$ is a flat module, then $\bar\varphi : L({}_R R^n) \to L({}_S S^n)$ is (finite) meet
preserving, i.e. $\bar\varphi(C \cap D) = \bar\varphi(C) \cap \bar\varphi(D)$ for all $C, D \in L({}_R R^n)$.

Proof. Cf. [4, §2.6]. □

The last statement is in particular valid if $S_R$ is a projective module.

In the coding theoretical context the flatness of $S_R$ might be important under
a different aspect: Let an $R$-linear left code $C$ of length $n$ possess a $(k \times n)$-
generator matrix $G$ and an $(m \times n)$-check matrix $H$. We have $C = \mathrm{Im}(G)$, which
is usually represented by the short exact sequence

$$0 \to \ker(G) \to R^k \to C \to 0.$$

Applying $\varphi$ to this sequence and subsequent tensoring with ${}_S S_R$ and $\mathrm{id}_S$ results
in $\bar\varphi(C) = \mathrm{Im}(\varphi(G))$ because of $\otimes$ being right exact. Therefore $\varphi$ maps
generator matrices for $C$ to such for $\bar\varphi(C)$. On the other hand the equality $C = \ker(H^t)$
may be represented by the sequence

$$0 \to C \to R^n \to \mathrm{Im}(H^t) \to 0,$$

the exactness of which is not preserved in general under the above manipulations.
If however $S_R$ is presumed to be flat then $\bar\varphi(C) = \ker(\varphi(H^t))$, which shows that
$\varphi$ maps check matrices for $C$ to such for $\bar\varphi(C)$.

Next, we clarify under which conditions $\bar\varphi$ is injective, and hence an order
embedding. It is known that tensoring preserves direct decompositions (cf. [1,
Lemma 19.9]), i.e. if $\bigoplus_{i \in I} C_i = R^n$ then $\bigoplus_{i \in I} \bar\varphi(C_i) = S^n$. For our question we
therefore obtain (cf. [11, 5.2.5]):

Proposition 3. The restriction of the lattice mapping $\bar\varphi : L({}_R R^n) \to L({}_S S^n)$
to the set of all direct summands of ${}_R R^n$ is injective.

We summarize what we have seen in terms of coding theory. From [8] recall
a splitting code to be a linear code having a complement in its ambient module.

Theorem 4. Let $\bar\varphi : L({}_R R^n) \to L({}_S S^n)$ denote the lattice mapping induced
by the ring embedding $\varphi : R \to S$.

1. $\bar\varphi$ is join-preserving. Its restriction to the set of all splitting codes is injective.

2. If the module $S_R$ is flat then $\bar\varphi$ is meet-preserving.

3. If $R$ is semisimple then $\bar\varphi$ is a lattice embedding.

Proof. The first two statements result from our initial remarks together with
Prop. 3 and Lem. 3. For our last claim we just remark that in the semisimple
case every code is splitting and every $R$-module is flat because of projectivity
(cf. [1, Cor. 17.4]). □



3 Matrix Representations of Algebras 

In what follows, let $K$ be a commutative ring, and $R$ be a $K$-algebra. We are
interested in unital embeddings $R \to M_m(K)$, for suitable $m \in \mathbb{N}$, and first
give conditions under which these always exist.

Proposition 4. A unital $K$-algebra embedding $R \to M_m(K)$ exists if and only
if there exists a faithful (left module) operation of $R$ on $K^m$.

Proof. Suppose there is an embedding $\alpha : R \to M_m(K)$. Then $K^m$ naturally
obtains the structure of an $R$-module by $r \cdot x = \alpha(r)x$ for all $x \in K^m$. Obviously
$r \cdot K^m = 0$ implies $r = 0$, which means that $R$ is operating faithfully on $K^m$.
If, conversely, $K^m$ is an $R$-module, then left-multiplication by elements of $R$ is
a $K$-endomorphism of $K^m$, which provides a homomorphism $\alpha : R \to M_m(K)$
which is injective if and only if ${}_R K^m$ is a faithful module. □

Observing that ${}_R R$ is always a faithful module it is now clear that if $R$ is
free of rank $m$ over $K$, then there is a natural faithful operation of $R$ on $K^m$.
According to Prop. 4 this gives rise to an embedding of $R$ in $M_m(K)$. For a
particular case this embedding is easy to construct.

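Proposition 5. Let $K$ be an Artinian commutative local ring and let $f :=
\sum_{i=0}^{m} f_i x^i \in K[x]$ be a monic irreducible polynomial. Then the mapping $\beta :
K[x] \to M_m(K)$, $g(x) \mapsto g(X)$ induces a unital embedding of $K[x]/f$ into
$M_m(K)$ where

$$X := \begin{pmatrix}
0 & 0 & \cdots & 0 & -f_0 \\
1 & 0 & \cdots & 0 & -f_1 \\
0 & 1 & \cdots & 0 & -f_2 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & -f_{m-1}
\end{pmatrix}.$$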

The matrix $X$ introduced in the last proposition is known as the companion
matrix of the polynomial $f$. As we saw, it provides a particular embedding of
$K[x]/f$ into the ring $M_m(K)$, and there immediately arises the question of a
classification of all unital embeddings of $K[x]/f$ into the ring $M_m(K)$. In case
of $K$ being a field, this is well-known as a consequence of the following theorem.
For a short proof see [10, p. 222].
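As a small illustration of the companion matrix construction, the following sketch builds the companion matrix of a monic polynomial over $\mathbb{Z}_q$ and checks that the polynomial vanishes on it. Restricting to $K = \mathbb{Z}_q$ and the names `companion`, `matmul` and `evaluate_monic` are assumptions made for the example, not part of the text.

```python
def companion(coeffs, q):
    """Companion matrix of the monic polynomial x^m + f_{m-1} x^{m-1} + ... + f_0 over Z_q.

    coeffs = [f_0, ..., f_{m-1}]; ones on the subdiagonal, -f_i in the last column.
    """
    m = len(coeffs)
    X = [[0] * m for _ in range(m)]
    for i in range(1, m):
        X[i][i - 1] = 1
    for i in range(m):
        X[i][m - 1] = (-coeffs[i]) % q
    return X


def matmul(A, B, q):
    """Matrix product modulo q."""
    return [[sum(a * b for a, b in zip(row, col)) % q for col in zip(*B)] for row in A]


def evaluate_monic(coeffs, X, q):
    """Evaluate the monic polynomial with lower coefficients coeffs at the matrix X (Horner)."""
    m = len(X)
    identity = [[int(i == j) for j in range(m)] for i in range(m)]
    result = identity  # leading coefficient 1
    for c in reversed(coeffs):
        result = matmul(result, X, q)
        result = [[(result[i][j] + c * identity[i][j]) % q for j in range(m)] for i in range(m)]
    return result


X = companion([1, 1, 0], 2)            # x^3 + x + 1 over F_2
print(X)
print(evaluate_monic([1, 1, 0], X, 2))  # the zero matrix
```

Since $f(X) = 0$, the assignment $g(x) \mapsto g(X)$ factors through $K[x]/f$, which is the mechanism behind the embedding.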

Theorem 5. Let $K$ be a field, and $S$ be a simple $K$-algebra the center of which
is $K$. Then every algebra homomorphism from a simple subalgebra $R$ of $S$ into
$S$ is the restriction of an inner automorphism of $S$.

For the context at hand, there is a need for a statement clarifying the situation
where $K$ is a local ring. Problems of this type have been treated in [12], which
contains a proof of the following:

Theorem 6. Let $K$ be an Artinian commutative local ring, and let $f \in K[x]$ be
a monic irreducible polynomial of degree $m$. Then each pair of unital embeddings
of $K[x]/f$ into the matrix ring $M_m(K)$ are conjugate.

We finally give four examples of unital embeddings based on the previously
outlined companion matrix approach. These embeddings will be used for new
code representations in the following section.

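Example 1. 1. Let $K := \mathbb{F}_2$ and $R := K[\xi]$ with $\xi^2 + \xi + 1 = 0$, i.e. $R$ is
the 4-element field. The companion matrix approach yields the embedding
$R \to M_2(K)$ generated by

$$\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}.$$

2. Let $K := \mathbb{F}_2$ and $R := K[\xi]$ with $\xi^3 + \xi + 1 = 0$ be the 8-element field. We
will make use of the embedding $R \to M_3(K)$ generated by

$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}^{-1}
\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}.$$

3. For $K := \mathbb{F}_3$ and the 9-element field $R := K[\xi]$ with $\xi^2 + \xi + 2 = 0$ we will
use the embedding $R \to M_2(K)$ generated by

$$\begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix}^{-1}
\begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix}.$$

4. Let $K := \mathbb{Z}_4$ and consider $R := K[\xi]$ with $\xi^2 + \xi + 1 = 0$, i.e. the Galois
ring GR(4,2). We introduce the embedding $R \to M_2(K)$ generated by

$$\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}
= \begin{pmatrix} 3 & 1 \\ 1 & 0 \end{pmatrix}^{-1}
\begin{pmatrix} 0 & 3 \\ 1 & 3 \end{pmatrix}
\begin{pmatrix} 3 & 1 \\ 1 & 0 \end{pmatrix},$$

which will play its role in a new representation of the quaternary Octacode.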
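The conjugation identities of items 2 to 4 can be verified mechanically. The sketch below reads each identity as $A = P^{-1} X P$ with $X$ the companion matrix, and checks the equivalent statement $PA = XP$ together with invertibility of $P$ modulo $q$. The reading of the matrices and all names in the code are assumptions made for this illustration.

```python
from math import gcd


def matmul(A, B, q):
    """Matrix product modulo q."""
    return [[sum(a * b for a, b in zip(row, col)) % q for col in zip(*B)] for row in A]


def det(M):
    """Integer determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def is_conjugate_via(A, P, X, q):
    """True if P is invertible modulo q and A = P^{-1} X P over Z_q."""
    return gcd(det(P), q) == 1 and matmul(P, A, q) == matmul(X, P, q)


# (modulus q, generating matrix A, conjugating matrix P, companion matrix X)
cases = [
    (2, [[1, 1, 1], [1, 1, 0], [1, 0, 0]],
        [[0, 1, 0], [0, 0, 1], [1, 1, 0]],
        [[0, 0, 1], [1, 0, 1], [0, 1, 0]]),
    (3, [[1, 2], [1, 1]], [[1, 0], [1, 2]], [[0, 1], [1, 2]]),
    (4, [[2, 1], [1, 1]], [[3, 1], [1, 0]], [[0, 3], [1, 3]]),
]
results = [is_conjugate_via(A, P, X, q) for q, A, P, X in cases]
print(results)
```

Checking $PA = XP$ avoids computing an inverse over a ring such as $\mathbb{Z}_4$, where only units can be inverted.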



4 Representations for Linear Codes 

In the foregoing sections we dealt with different kinds of ring embeddings and
clarified the induced correspondences between respective sets of linear codes.
The goal of this section is to combine these results: Given a $K$-algebra $R$, a
natural number $m$, and a unital embedding $R \to M_m(K)$, we study a kind of
representation of all codes of length $n$ over $R$ by codes of length $nm$ over $K$
according to the following diagram.

    [n, k]-codes over $M_m(K)$   <-- Sect. 1 -->   [nm, km]-codes over $K$
              |
              |  Sect. 2
              |
    [n, k]-codes over $R$

This may be illustrated next:



Example 2. Let $\mathcal{H}(2, 3)$ be the extended binary Hamming code of length 8 and
dimension 4, and let $S$ denote the ring of $(2 \times 2)$-matrices over $\mathbb{F}_2$. Row
manipulations on the usual parity check matrix for $\mathcal{H}(2, 3)$ produce a matrix which is
especially well suited for a representation of the code via the four-element field, namely:

$$H := \begin{pmatrix}
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 0
\end{pmatrix}.$$

Considering the embedding $\mathbb{F}_4 = \mathbb{F}_2[\xi] \to M_2(\mathbb{F}_2)$ as introduced in Ex. 1(1) it
is obvious that the above matrix comes from an $\mathbb{F}_4$-check matrix given by

$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & \xi & \xi^2 \end{pmatrix}.$$

It checks a self-dual [4, 2, 3] code correcting one random error, which carries over
to the image code over $M_2(\mathbb{F}_2)$, i.e. the latter code corrects up to one full matrix
error. This yields more than can be directly seen from the distance properties
of the code $\mathcal{H}(2, 3)$ that we started with: it proves in an elementary way that
$\mathcal{H}(2, 3)$, in addition to its single random error correction, is able to correct
certain adjacent double error patterns.
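The passage from a check matrix over the four-element field to the binary matrix $H$ can be reproduced by replacing every field element with its $2 \times 2$ image. In the sketch below the image of the field generator is taken to be the companion matrix of $x^2 + x + 1$, and the string labels for field elements are purely illustrative assumptions.

```python
def mul2(A, B):
    """Product of 2 x 2 matrices over F_2."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) % 2 for j in range(2))
                 for i in range(2))


ZERO = ((0, 0), (0, 0))
ONE = ((1, 0), (0, 1))
XI = ((0, 1), (1, 1))            # assumed image of the generator xi
IMAGE = {"0": ZERO, "1": ONE, "xi": XI, "xi^2": mul2(XI, XI)}


def expand(check_matrix):
    """Replace each F_4 entry by its 2 x 2 binary image, block row by block row."""
    rows = []
    for row in check_matrix:
        for i in range(2):
            rows.append([bit for symbol in row for bit in IMAGE[symbol][i]])
    return rows


H4 = [["1", "1", "1", "1"],
      ["0", "1", "xi", "xi^2"]]
H = expand(H4)
for row in H:
    print(row)
```

The output agrees with the binary matrix $H$ displayed in the example, which is what "comes from an $\mathbb{F}_4$-check matrix" means concretely.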



As a generalization we obtain:

Theorem 7. For all odd $r \in \mathbb{N}$ the extended binary Hamming code $\mathcal{H}(2, r)$ is
induced by an $\mathbb{F}_4$-linear code.

Optimal Binary Codes of Length 24 and the Binary Golay Code

Pasquier [14] and Wolfmann [16] show how to construct the binary Golay code
from the extended self-dual $[8,4,5]_8$ Reed-Solomon code over the eight-element
field (cf. also [6, p. 130]).

If $\alpha$ is a primitive element of $\mathbb{F}_8$ satisfying $\alpha^3 = \alpha + 1$ then the elements of
the $\mathbb{F}_2$-basis $B := \{\alpha^6, \alpha^3, \alpha^5\}$ of $\mathbb{F}_8$ satisfy the relation $\mathrm{tr}(xy) = \delta_{xy}$. The image
of a coordinate-wise binary representation of the Reed-Solomon code at hand
using $B$ is the [24, 12, 8] Golay code. Note that the matrix representation of the
mapping $\mathbb{F}_8 \to \mathbb{F}_8$, $x \mapsto \alpha x$ is given by

$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix},$$

i.e. the matrix in Ex. 1(2).

Representing all other $\mathbb{F}_8$-Reed-Solomon codes in this way, we obtain binary
linear codes of length 24 and dimensions 21, 18, 15, 9, 6, 3, respectively. Elementary
row and column manipulations prove that the resulting codes have the
parameters [24,21,2], [24,18,4], [24,15,4], [24,9,8], [24,6,8], [24,3,8]. According
to [15] the first 4 of them are optimal. For the [24, 12, 8] Golay code this yields
a generator matrix for the Wolfmann description [16], which we write down as
an illustration.

    100100100100100100100100
    010010010010010010010010
    001001001001001001001001
    000100111101011010110001
    000010110001100111101011
    000001100111101011010110
    000100101010001111011110
    000010001111011110100101
    000001111011110100101010
    000100011001101110111010
    000010100011001101110111
    000001101110111010100011

The Ternary Golay Code 

Goldberg [7] gives a construction of the [12, 6, 6] Golay code as a ternary image
of a self-dual code over $\mathbb{F}_9$. We give a slight modification based on the embedding
in Ex. 1(3). The matrix

$$\begin{pmatrix}
1 & 0 & 0 & 1 & \xi & -\xi \\
0 & 1 & 0 & \xi & 1 & \xi \\
0 & 0 & 1 & -\xi & \xi & 1
\end{pmatrix}$$

is a check matrix for a kind of [6,3,4] (hermitian self-dual) hexacode over $\mathbb{F}_9$.
Its image under the given embedding produces the ternary $(6 \times 12)$-matrix

    100000101221
    010000011122
    001000121012
    000100110111
    000010211210
    000001221101

which by elementary row and column manipulations can be seen to be a check
matrix for the extended ternary [12,6,6] Golay code.

The Octacode

In Ex. 1(4) we introduced a unital embedding of the Galois ring GR(4,2) into
$M_2(\mathbb{Z}_4)$. It induces the $(2 \times 4)$-check matrix

$$\begin{pmatrix} 1 & 0 & \xi & \xi^2 \\ 0 & 1 & \xi^2 & -\xi \end{pmatrix}$$

for a self-dual code over GR(4,2) (a kind of tetracode) being mapped to the
$(4 \times 8)$-matrix

    10002113
    01001132
    00101323
    00013233

Up to equivalence the latter matrix can be seen to be a check matrix for the
Octacode.

Abstract. In [7] Rains has shown that for any linear code $C$ over $\mathbb{Z}_4$,
$d_H$, the minimum Hamming distance of $C$, and $d_L$, the minimum Lee distance
of $C$, satisfy $d_H \ge \lceil d_L/2 \rceil$. $C$ is said to be of type $\alpha$ ($\beta$) if $d_H = \lceil d_L/2 \rceil$
($d_H > \lceil d_L/2 \rceil$). In this paper we define Simplex codes of type $\alpha$ and $\beta$,
namely, $S_k^\alpha$ and $S_k^\beta$, respectively, over $\mathbb{Z}_4$. Some fundamental properties
like 2-dimension, Hamming and Lee weight distributions, weight hierarchy
etc. are determined for these codes. It is shown that binary images of
$S_k^\alpha$ and $S_k^\beta$ by the Gray map give rise to some interesting binary codes.



1 Introduction 

The key motivation for studying codes over $\mathbb{Z}_4$, the ring of integers modulo 4,
is that they can be used to obtain desirable types of good binary codes. Such
codes have been studied recently in connection with the construction of lattices,
sequences with low correlation and in a variety of other contexts [3]-[5], [10], [11].
Many good nonlinear binary codes of high minimum distance have a simple
description as linear codes over $\mathbb{Z}_4$. Because these codes are linear, decoding
becomes simpler.

A linear code $C$, of length $n$, over $\mathbb{Z}_4$ is a submodule of $\mathbb{Z}_4^n$. The minimum
Hamming distance $d_H$ of $C$ is given by $d_H = \min\{w_H(x - y) : x, y \in C, x \ne y\}$,
where $w_H(x)$ is the number of nonzero components in $x$. It is widely used
for error correction/detection capabilities. Another distance, which is not as
widely used, is the Lee distance. The Lee weight of an element $a \in \mathbb{Z}_4$, denoted
$w_L(a)$, is the minimum of $\{a, 4 - a\}$. The Lee weight of a vector $x \in \mathbb{Z}_4^n$ is the
sum of the Lee weights of its components, and the minimum Lee distance of $C$ is
$d_L = \min\{w_L(x - y) : x, y \in C, x \ne y\}$.

In [7] Rains has shown that for any linear code $C$ over $\mathbb{Z}_4$, $d_H \ge \lceil d_L/2 \rceil$.
$C$ is said to be a code of type $\alpha$ ($\beta$) if $d_H = \lceil d_L/2 \rceil$ ($d_H > \lceil d_L/2 \rceil$). Note that the
Octacode is of type $\beta$ while the Reed-Muller code of first order is of type $\alpha$. The
Gray map $\phi : \mathbb{Z}_4^n \to \mathbb{Z}_2^{2n}$ is the coordinate-wise extension of the function from
$\mathbb{Z}_4$ to $\mathbb{Z}_2^2$ defined by $0 \to (0,0)$, $1 \to (0,1)$, $2 \to (1,1)$ and $3 \to (1,0)$. Thus
$\phi(C)$, the image of a linear code $C$ over $\mathbb{Z}_4$ of length $n$ by the Gray map, is a
binary code of length $2n$. If $\phi(C)$ is linear then $C$ is called $\mathbb{Z}_2$-linear. Some well
known binary nonlinear codes are images by the Gray map of linear codes over
$\mathbb{Z}_4$. The dual code $C^\perp$ of $C$ is defined in the natural way. In [5] Hammons et al. have
shown that Kerdock and Preparata codes are dual over $\mathbb{Z}_4$, which explains
the duality relation between these nonlinear binary families.
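The definitions of the Lee weight and the Gray map translate directly into code. The following minimal sketch (function names are illustrative) also checks the basic property that the Gray map turns Lee weight into Hamming weight.

```python
GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}


def gray(x):
    """Coordinate-wise Gray map from Z_4^n to Z_2^{2n}."""
    return [bit for a in x for bit in GRAY[a % 4]]


def lee_weight(x):
    """Sum over the coordinates of min(a, 4 - a)."""
    return sum(min(a % 4, 4 - a % 4) for a in x)


def hamming_weight(x):
    """Number of nonzero coordinates."""
    return sum(1 for a in x if a % 4 != 0)


x = [1, 2, 3, 0]
print(gray(x), lee_weight(x), hamming_weight(x))
```

For the sample vector the Lee weight is 4 while the Hamming weight is 3, and the Gray image has exactly 4 ones.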

In this correspondence we define $\mathbb{Z}_4$-Simplex codes of type $\alpha$ and $\beta$, namely,
$S_k^\alpha$ and $S_k^\beta$, and study the images of such codes under the Gray map in two
ways. In [9] Satyanarayana has shown that the Lee weight of every nonzero
codeword in $S_k^\alpha$ is $2^{2k}$. A characterization of such constant Lee weight codes
has been obtained by Carlet [2]. We determine the fundamental parameters like
2-dimension, Hamming and Lee weight distributions and weight hierarchy for
these codes. It is shown that both $S_k^\alpha$ and $S_k^\beta$ satisfy chain conditions. We also
obtain some nonlinear and linear binary codes associated with these codes.

Section 2 contains some preliminaries and notations. Definitions and basic
parameters of $\mathbb{Z}_4$-simplex codes of type $\alpha$ and $\beta$ are given in Section 3. Section
4 deals with binary images of Simplex codes of type $\alpha$ and $\beta$ under the Gray
map and some fundamental properties.



2 Preliminaries and Notations 



A linear code $C$ over $\mathbb{Z}_4$ has a generator matrix $G$ (rows of $G$ generate $C$) of
the form

$$G = \begin{pmatrix} I_{k_0} & A & B_1 + 2B_2 \\ 0 & 2I_{k_1} & 2C \end{pmatrix}, \qquad (1)$$

where $A, B_1, B_2$ and $C$ are matrices with entries 0 or 1 and $I_k$ is the identity
matrix of order $k$. Two codes are said to be equivalent if one can be obtained
from the other by rearrangements of columns or by multiplying one or more
coordinates by a unit in $\mathbb{Z}_4$.

For each $a \in \mathbb{Z}_4$ let $\bar a$ be the reduction of $a$ modulo 2. Then the code $C^{(1)} =
\{(\bar c_1, \bar c_2, \ldots, \bar c_n) : (c_1, c_2, \ldots, c_n) \in C\}$ is a binary linear code called the residue
code of $C$. Another binary linear code associated with $C$ is the torsion code $C^{(2)}$,
which is defined by

$$C^{(2)} = \{ c/2 : c = (c_1, \ldots, c_n) \in C \text{ and } c_i \equiv 0 \pmod 2 \text{ for } 1 \le i \le n \}.$$

If $k_1 = 0$ then $C^{(1)} = C^{(2)}$. For details and further references see [3], [8].

A vector $v$ is a 2-linear combination of the vectors $v_1, v_2, \ldots, v_k$ if $v = l_1 v_1 +
\cdots + l_k v_k$ with $l_i \in \mathbb{Z}_2$ for $1 \le i \le k$. A subset $S = \{v_1, v_2, \ldots, v_k\}$ of $C$ is called
a 2-basis for $C$ if for each $i = 1, 2, \ldots, k - 1$, $2v_i$ is a 2-linear combination of
$v_{i+1}, \ldots, v_k$, $2v_k = 0$, $C$ is the 2-linear span of $S$ and $S$ is 2-linearly independent
[13]. The number of elements in a 2-basis for $C$ is called the 2-dimension of $C$. It
is easy to verify that the rows of the matrix

$$B = \begin{pmatrix} I_{k_0} & A & B_1 + 2B_2 \\ 2I_{k_0} & 2A & 2B_1 \\ 0 & 2I_{k_1} & 2C \end{pmatrix} \qquad (2)$$

form a 2-basis for the code $C$ generated by $G$ given in (1). The following lemma
will be needed in Section 4.

Lemma 1. The images of the rows of $B$ under the Gray map $\phi$ are linearly
independent over $\mathbb{Z}_2$.

Proof. Applying the Gray map $\phi$ to the rows of $B$ with a suitable rearrangement
of rows yields a binary matrix of the type

$$\begin{pmatrix} \mathrm{diag}(T, \ldots, T) & * & * \\ 0 & \mathrm{diag}(U, \ldots, U) & * \end{pmatrix}$$

with $k_0$ blocks of $T = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}$ and $k_1$ blocks of $U = \begin{pmatrix} 1 & 1 \end{pmatrix}$. It is easy to see that the rows
of the above matrix are linearly independent.

A linear code $C$ over $\mathbb{Z}_4$ (over $\mathbb{Z}_2$) of length $n$, 2-dimension $k$, minimum
Hamming distance $d_H$ and minimum Lee distance $d_L$ is called an $[n, k, d_H, d_L]$
($[n, k, d_H]$) or simply an $[n, k]$ code. A necessary and sufficient condition for
$\mathbb{Z}_2$-linearity is given by the following theorem.

Theorem 1. [5] $C$ is $\mathbb{Z}_2$-linear if and only if whenever $c = (c_1, \ldots, c_n)$, $c' =
(c'_1, \ldots, c'_n) \in C$, $2\bar c * \bar c' = (2\bar c_1 \bar c'_1, \ldots, 2\bar c_n \bar c'_n) \in C$.

Thus, if $C$ is an $[n, k, d_H, d_L]$ $\mathbb{Z}_2$-linear code over $\mathbb{Z}_4$ then $\phi(C)$ is a binary
linear $[2n, k, d_L]$ code. Hence by the Griesmer bound for binary linear codes [6]
we have

$$2n \ge \sum_{i=0}^{k-1} \left\lceil \frac{d_L}{2^i} \right\rceil. \qquad (3)$$

Note that the Octacode $O_8$ meets the bound given by (3) even though it is
not $\mathbb{Z}_2$-linear. For $1 \le r \le k$, the $r$-th Generalized Hamming weight of $C$ is
defined by

$$d_r(C) = \min\{w_S(D_r) : D_r \text{ is an } [n, r] \text{ subcode of } C\},$$

where $w_S(D)$, called the support size of $D$, is the number of coordinates in which
some codeword of $D$ has a nonzero entry. The set $\{d_1(C), d_2(C), \ldots, d_k(C)\}$ is
called the weight hierarchy of $C$. In [15] Yang et al. have obtained a lemma
connecting Lee weights to the support size of a subcode. Thus for any linear code
$C$ over $\mathbb{Z}_4$, $d_r(C)$ may also be defined by

$$d_r(C) = \frac{1}{2^r} \min\Big\{ \sum_{d \in D_r} w_L(d) : D_r \text{ is an } r\text{-dimensional subcode of } C \Big\}.$$

The two definitions are equivalent. $C$ is said to satisfy the chain condition if there
exists a chain

$$D_1 \subseteq D_2 \subseteq \cdots \subseteq D_k$$

of subcodes of $C$ satisfying $w_S(D_r) = d_r(C)$, $1 \le r \le k$.

In [14] Yang and Helleseth have shown that the Goethals code over $\mathbb{Z}_4$ of
length 8 satisfies the chain condition. A relation between $d_r(C)$ and $d_r(C^\perp)$ is given
by the following theorem.

Theorem 2. [1] Let $C$ be an $[n, k]$ linear code over $\mathbb{Z}_4$. Then
$\{d_r(C) : 1 \le r \le k\} = \{1, 1, 2, 2, \ldots, n, n\} \setminus \{n + 1 - d_r(C^\perp) : 1 \le r \le 2n - k\}$.

3 $\mathbb{Z}_4$-Simplex Codes of Type $\alpha$ and $\beta$

Let $G_k^\alpha$ be a $k \times 2^{2k}$ matrix over $\mathbb{Z}_4$ consisting of all distinct columns. Inductively,
$G_k^\alpha$ may be written as

$$G_k^\alpha = \left(\begin{array}{c|c|c|c}
0\,0 \cdots 0 & 1\,1 \cdots 1 & 2\,2 \cdots 2 & 3\,3 \cdots 3 \\
G_{k-1}^\alpha & G_{k-1}^\alpha & G_{k-1}^\alpha & G_{k-1}^\alpha
\end{array}\right)$$

with $G_1^\alpha = [0\ 1\ 2\ 3]$. The code $S_k^\alpha$ generated by $G_k^\alpha$ has been visited earlier [9,2]. In
[9] Satyanarayana has shown that the Lee weight of every nonzero codeword of
$S_k^\alpha$ is $2^{2k}$, while Carlet in [2] has classified all constant Lee weight codes over
$\mathbb{Z}_4$. Clearly, the 2-dimension of $S_k^\alpha$ is $2k$. The following observations are useful
in determining Hamming (Lee) weight distributions of $S_k^\alpha$.

Remark 1. If $A_{k-1}$ denotes an array of codewords in $S_{k-1}^\alpha$ and if $\mathbf{i} = (i, i, \ldots, i)$
then an array of all codewords of $S_k^\alpha$ is given by

$$\begin{array}{cccc}
A_{k-1} & A_{k-1} & A_{k-1} & A_{k-1} \\
A_{k-1} & \mathbf{1} + A_{k-1} & \mathbf{2} + A_{k-1} & \mathbf{3} + A_{k-1} \\
A_{k-1} & \mathbf{2} + A_{k-1} & A_{k-1} & \mathbf{2} + A_{k-1} \\
A_{k-1} & \mathbf{3} + A_{k-1} & \mathbf{2} + A_{k-1} & \mathbf{1} + A_{k-1}
\end{array}$$

Remark 2. If $R_1, R_2, \ldots, R_k$ denote the rows of the matrix $G_k^\alpha$ then $w_H(R_i) =
3 \cdot 2^{2k-2}$, $w_H(2R_i) = 2^{2k-1}$ and $w_L(R_i) = 2^{2k} = w_L(2R_i)$.

It may be observed that each element of $\mathbb{Z}_4$ occurs equally often in every
row of $G_k^\alpha$. In fact we have the following lemma.

Lemma 2. Let $c \in S_k^\alpha$. If one of the coordinates of $c$ is a unit then every element
of $\mathbb{Z}_4$ occurs $4^{k-1}$ times as a coordinate of $c$. Otherwise $w_H(c) = 2^{2k-1}$.

Proof. By Remark 1, any $x \in S_{k-1}^\alpha$ gives rise to the following four codewords of
$S_k^\alpha$:

$y_1 = (x \mid x \mid x \mid x)$, $y_2 = (x \mid \mathbf{1} + x \mid \mathbf{2} + x \mid \mathbf{3} + x)$,

$y_3 = (x \mid \mathbf{2} + x \mid x \mid \mathbf{2} + x)$ and $y_4 = (x \mid \mathbf{3} + x \mid \mathbf{2} + x \mid \mathbf{1} + x)$.

Hence by induction, the assertion follows.

Let $G(S_k)$ (columns consist of all nonzero binary $k$-tuples) be a generator
matrix for an $[n, k]$ binary simplex code $S_k$. Then the extended binary simplex
code $\hat S_k$ is generated by the matrix $G(\hat S_k) = [\mathbf{0} \mid G(S_k)]$. Inductively,

$$G(\hat S_k) = \left(\begin{array}{c|c}
0\,0 \cdots 0 & 1\,1 \cdots 1 \\
G(\hat S_{k-1}) & G(\hat S_{k-1})
\end{array}\right) \quad \text{with } G(\hat S_1) = [0\ 1]. \qquad (4)$$

Lemma 3. The torsion code of $S_k^\alpha$ is equivalent to $2^k$ copies of $\hat S_k$.

Proof. Observe that the torsion code of $S_k^\alpha$ is the set of codewords obtained by
replacing 2 by 1 in all 2-linear combinations of the rows of the matrix

$$2G_k^\alpha. \qquad (5)$$

Now the assertion follows by induction on $k$ and by regrouping the columns
in (5) according to (4).

As a consequence of Lemmas 2 and 3 one obtains Hamming and Lee weight
distributions of $S_k^\alpha$.

Theorem 3. The Hamming and Lee weight distributions of $S_k^\alpha$ are:

1. $A_H(0) = 1$, $A_H(2^{2k-1}) = 2^k - 1$, $A_H(3 \cdot 2^{2k-2}) = 2^k(2^k - 1)$, and

2. $A_L(0) = 1$, $A_L(2^{2k}) = 2^{2k} - 1$,

where $A_H(i)$ ($A_L(i)$) denotes the number of vectors of Hamming (Lee)
weight $i$ in $S_k^\alpha$.

Proof. By Lemma 2, each nonzero codeword of $S_k^\alpha$ has Hamming weight either
$3 \cdot 4^{k-1}$ or $2^{2k-1}$ and Lee weight $2^{2k}$. Since the dimension of the torsion code is
$k$, there will be $2^k - 1$ codewords of weight $2^{2k-1}$. Hence the number of
codewords having weight $3 \cdot 4^{k-1}$ will be $4^k - 2^k$.
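For small $k$ the weight distributions of Theorem 3 can be confirmed by brute force. The sketch below takes the columns of the generator matrix to be all vectors of $\mathbb{Z}_4^k$ in lexicographic order, which matches the inductive description; function names are illustrative.

```python
from collections import Counter
from itertools import product


def G_alpha(k):
    """Rows of the type alpha generator matrix: columns run through all of Z_4^k."""
    columns = list(product(range(4), repeat=k))
    return [[col[i] for col in columns] for i in range(k)]


def codewords(G):
    """All Z_4-linear combinations of the rows of G."""
    for coeffs in product(range(4), repeat=len(G)):
        yield tuple(sum(a * g for a, g in zip(coeffs, col)) % 4 for col in zip(*G))


def weight_distributions(G):
    """Hamming and Lee weight distributions of the code generated by G."""
    hamming, lee = Counter(), Counter()
    for c in codewords(G):
        hamming[sum(1 for a in c if a)] += 1
        lee[sum(min(a, 4 - a) for a in c)] += 1
    return hamming, lee


ham, lee = weight_distributions(G_alpha(2))
print(dict(ham), dict(lee))
```

For $k = 2$ the theorem predicts 3 codewords of Hamming weight 8, 12 of Hamming weight 12, and Lee weight 16 for all 15 nonzero codewords.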




Remark 3. 1. $S_k^\alpha$ is an equidistant code with respect to Lee distance whereas
$\hat S_k$ is an equidistant binary code with respect to Hamming distance.

2. $S_k^\alpha$ is of type $\alpha$.

Let $c = (c_1, c_2, \ldots, c_n) \in C$ and let $\omega_j(c) = |\{k : c_k = j\}|$. Then the correlation
of $c \in C$ is defined as $\theta(c) = (\omega_0(c) - \omega_2(c)) + i(\omega_1(c) - \omega_3(c))$, where $i =
\sqrt{-1}$ [11]. The symmetrized weight enumerator (swe) of $C$ over $\mathbb{Z}_4$ is given by

$$swe(x, y, z) = \sum_{c \in C} x^{\omega_0(c)}\, y^{\omega_1(c) + \omega_3(c)}\, z^{\omega_2(c)}.$$

Let $\bar S_k^\alpha$ be the punctured code of $S_k^\alpha$ obtained by deleting the zero coordinate.
Then the swe of $\bar S_k^\alpha$ is

$$swe(x, y, z) = x^{n(k)} + (2^k - 1)\, x^{2^{2k-1} - 1} z^{2^{2k-1}} + 2^k (2^k - 1)\, x^{4^{k-1} - 1} y^{2^{2k-1}} z^{4^{k-1}},$$

where $n(k) = 4^k - 1$, and the correlation of any nonzero $c \in \bar S_k^\alpha$ is given by

$$\theta(c) = -1. \qquad (6)$$

The length of $S_k^\alpha$ is large compared to its 2-dimension and increases fast
as the 2-dimension grows. But one can always drop some columns from
$G_k^\alpha$ in a specific manner to yield good codes over $\mathbb{Z}_4$ in the sense of having
maximum possible Lee weights for the given length and 2-dimension.

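Let $G_k^\beta$ be the $k \times 2^{k-1}(2^k - 1)$ matrix defined inductively by

$$G_2^\beta = \left(\begin{array}{c|c|c} 1\,1\,1\,1 & 0 & 2 \\ 0\,1\,2\,3 & 1 & 1 \end{array}\right)$$

and for $k > 2$

$$G_k^\beta = \left(\begin{array}{c|c|c}
1\,1 \cdots 1 & 0\,0 \cdots 0 & 2\,2 \cdots 2 \\
G_{k-1}^\alpha & G_{k-1}^\beta & G_{k-1}^\beta
\end{array}\right),$$

where $G_{k-1}^\alpha$ is the generator matrix of $S_{k-1}^\alpha$. Note that $G_k^\beta$ is obtained from $G_k^\alpha$
by deleting $2^{k-1}(2^k + 1)$ columns. By induction it is easy to verify that no two
columns of $G_k^\beta$ are multiples of each other. Let $S_k^\beta$ be the code generated by $G_k^\beta$.
Note that $S_k^\beta$ is a $[2^{k-1}(2^k - 1), 2k]$ code. To determine Hamming (Lee) weight
distributions of $S_k^\beta$ we first make a few observations.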



Remark 4. If $A_{k-1}$ ($B_{k-1}$) denotes an array of codewords in $S_{k-1}^\alpha$ ($S_{k-1}^\beta$) and if
$\mathbf{i} = (i, i, \ldots, i)$ then an array of all codewords of $S_k^\beta$ is given by

$$\begin{array}{ccc}
A_{k-1} & B_{k-1} & B_{k-1} \\
\mathbf{1} + A_{k-1} & B_{k-1} & \mathbf{2} + B_{k-1} \\
\mathbf{2} + A_{k-1} & B_{k-1} & B_{k-1} \\
\mathbf{3} + A_{k-1} & B_{k-1} & \mathbf{2} + B_{k-1}
\end{array}$$

Remark 5. Each row of $G_k^\beta$ has Hamming weight $2^{k-3}[3(2^k - 1) + 1]$ and Lee
weight $2^{k-1}(2^k - 1)$.

Proposition 1. Each row of $G_k^\beta$ contains $2^{2(k-1)}$ units and
$\omega_0 = \omega_2 = 2^{k-2}(2^{k-1} - 1)$.

Proof. Clearly, the assertion holds for the first row. Assume that the result holds
for each row of $G_{k-1}^\beta$. Then the number of units in each row of $G_{k-1}^\beta$ is $2^{2(k-2)}$.
By Lemma 2, the number of units in any row of $G_{k-1}^\alpha$ is $2^{2k-3}$. Hence the total
number of units in any row of $G_k^\beta$ will be $2^{2k-3} + 2 \cdot 2^{2(k-2)} = 2^{2(k-1)}$. A similar
argument holds for the number of 0's and 2's.

In fact, similar to $S_k^\alpha$, we have the following lemma.

Lemma 4. Let $c \in S_k^\beta$. If one of the coordinates of $c$ is a unit then $\omega_1 + \omega_3 =
2^{2(k-1)}$ and $\omega_0 = \omega_2 = 2^{k-2}(2^{k-1} - 1)$. Otherwise $\omega_0 = 2^{k-1}(2^{k-1} - 1)$, $\omega_2 =
2^{2(k-1)}$ and $\omega_1 = \omega_3 = 0$.

Proof. Let $x \in S_k^\beta$. Then, by Remark 4, there exist $y_1 \in S_{k-1}^\alpha$ and $y_2 \in S_{k-1}^\beta$ such
that $x$ can have any of the following four forms:

$x = (y_1 \mid y_2 \mid y_2)$, $x = (\mathbf{1} + y_1 \mid y_2 \mid \mathbf{2} + y_2)$,

$x = (\mathbf{2} + y_1 \mid y_2 \mid y_2)$ or $x = (\mathbf{3} + y_1 \mid y_2 \mid \mathbf{2} + y_2)$.

Now the assertion follows by induction and Lemma 2.

The proof of the following lemma, being similar to the proof of Lemma 3, is
omitted.

Lemma 5. The torsion code of $S_k^\beta$ is equivalent to $2^{k-1}$ copies of the binary
simplex code $S_k$.

The proof of the following theorem, being similar to the proof of Theorem 3, is
omitted.

Theorem 4. The Hamming and Lee weight distributions of $S_k^\beta$ are:

1. $A_H(0) = 1$, $A_H(2^{2k-2}) = 2^k - 1$, $A_H(2^{k-3}[3(2^k - 1) + 1]) = 2^k(2^k - 1)$, and

2. $A_L(0) = 1$, $A_L(2^{2k-1}) = 2^k - 1$, $A_L(2^{k-1}(2^k - 1)) = 2^k(2^k - 1)$.
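A brute-force check of Theorem 4 for $k = 2$ can be sketched as follows. The recursive construction mirrors the inductive definition of the type $\beta$ generator matrix; the function names are illustrative and only the smallest case is evaluated here.

```python
from collections import Counter
from itertools import product


def G_alpha(k):
    """Type alpha generator matrix: columns are all vectors of Z_4^k."""
    columns = list(product(range(4), repeat=k))
    return [[col[i] for col in columns] for i in range(k)]


def G_beta(k):
    """Type beta generator matrix, built inductively from the case k = 2."""
    if k == 2:
        return [[1, 1, 1, 1, 0, 2], [0, 1, 2, 3, 1, 1]]
    A, B = G_alpha(k - 1), G_beta(k - 1)
    top = [1] * len(A[0]) + [0] * len(B[0]) + [2] * len(B[0])
    return [top] + [A[i] + B[i] + B[i] for i in range(k - 1)]


def weight_distributions(G):
    """Hamming and Lee weight distributions of the Z_4-code generated by G."""
    hamming, lee = Counter(), Counter()
    for coeffs in product(range(4), repeat=len(G)):
        c = [sum(a * g for a, g in zip(coeffs, col)) % 4 for col in zip(*G)]
        hamming[sum(1 for a in c if a)] += 1
        lee[sum(min(a, 4 - a) for a in c)] += 1
    return hamming, lee


ham, lee = weight_distributions(G_beta(2))
print(dict(ham), dict(lee))
```

For $k = 2$ the code has length 6, with 3 codewords of Hamming weight 4 and Lee weight 8, and 12 codewords of Hamming weight 5 and Lee weight 6, as the theorem states.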



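Remark 6. 1. $S_k^\beta$ is of type $\beta$.

2. The correlation of each nonzero codeword of $S_k^\beta$ with components 0's or 2's
is $-2^{k-1}$.

3. The swe of $S_k^\beta$ is given as

$$swe(x, y, z) = x^{n(k)} + (2^k - 1)\, x^{2^{k-1}(2^{k-1} - 1)} z^{2^{2(k-1)}} + 2^k(2^k - 1)\, (xz)^{2^{k-2}(2^{k-1} - 1)}\, y^{2^{2(k-1)}},$$

where $n(k) = 2^{k-1}(2^k - 1)$.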



4 Gray Image Families 

Let $C$ be an $[n, k, d_H, d_L]$ linear code over $\mathbb{Z}_4$. Then $\phi(C)$ is a binary code having
$2^k$ codewords of length $2n$, and minimum Hamming distance $d_L$. However $\phi(C)$
need not be linear. Let $B$ be the matrix (given in (2)) whose rows form a 2-basis
for $C$ and let $\phi(B)$ be the matrix obtained from $B$ by applying the Gray map to
each entry of $B$. The code $C_\phi$ generated by $\phi(B)$ is a $[2n, k, \ge \lceil d_L/2 \rceil]$ binary linear
code. Note that $\phi(C)$ and $C_\phi$ have the same number of codewords but they are not
equal in general. The following proposition shows that both $\phi(S_k^\alpha)$ and $\phi(S_k^\beta)$
are not linear.




On $\mathbb{Z}_4$-Simplex Codes and Their Gray Images



Proposition 2. $\phi(S_k^\alpha)$ and $\phi(S_k^\beta)$ are nonlinear for all $k$.

Proof. Let $R_1, R_2, \ldots, R_k$ be the rows of the generator matrix $G_k^\alpha$ ($G_k^\beta$). Let
$c = R_k$ ($R_1$) and let $c' = R_{k-1}$ ($R_k$). Then by (6) (Remark 6.2), $2\bar c * \bar c' \notin S_k^\alpha$ ($S_k^\beta$).

Hence, by Theorem 1, the result follows.
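Nonlinearity of the Gray images can also be observed directly for $k = 2$ by testing closure under addition over $\mathbb{Z}_2$. The sketch below writes out the two generator matrices for $k = 2$ explicitly; the helper names are illustrative.

```python
from itertools import product

GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}


def gray_image(G):
    """Set of Gray images of all codewords of the Z_4-code generated by G."""
    image = set()
    for coeffs in product(range(4), repeat=len(G)):
        c = [sum(a * g for a, g in zip(coeffs, col)) % 4 for col in zip(*G)]
        image.add(tuple(bit for a in c for bit in GRAY[a]))
    return image


def is_linear(code):
    """A binary code containing 0 is linear iff it is closed under XOR."""
    return all(tuple(a ^ b for a, b in zip(u, v)) in code for u in code for v in code)


G2_alpha = [[0] * 4 + [1] * 4 + [2] * 4 + [3] * 4, [0, 1, 2, 3] * 4]
G2_beta = [[1, 1, 1, 1, 0, 2], [0, 1, 2, 3, 1, 1]]

print(is_linear(gray_image(G2_alpha)), is_linear(gray_image(G2_beta)))
```

Both tests report a failure of closure, in line with the proposition, whereas a code such as the one generated by the single word $(1, 1)$ has a linear Gray image.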

Remark 7. 1. $\phi(\bar S_k^\alpha)$ is a binary nonlinear code of length $2^{2k+1} - 2$ and minimum
Hamming distance $2^{2k}$. It meets the Plotkin bound [6] and $n < 2d_H$.

2. $\phi(S_k^\beta)$ is a binary nonlinear code of length $2^k(2^k - 1)$ and minimum Hamming
distance $2^{k-1}(2^k - 1)$. This is an example of a code having $n = 2d_H$ [6].

3. Even though both $\bar S_k^\alpha$ and $S_k^\beta$ are not $\mathbb{Z}_2$-linear, they meet the bound
given by (3).
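The statement about bound (3) can be checked numerically. The sketch assumes that (3) is the Griesmer bound applied to the Gray image, i.e. $2n \ge \sum_{i=0}^{k-1} \lceil d_L / 2^i \rceil$ with $k$ the 2-dimension, and uses the well-known parameters of the Octacode (length 8, 2-dimension 8, minimum Lee distance 6); the function names are illustrative.

```python
def griesmer_sum(d, k):
    """Right-hand side of the Griesmer bound: sum of ceil(d / 2^i) for i = 0..k-1."""
    return sum(-(-d // 2 ** i) for i in range(k))


def meets_bound(n, k, d_lee):
    """True if a Z_4-code of length n, 2-dimension k and Lee distance d_lee gives equality."""
    return 2 * n == griesmer_sum(d_lee, k)


octacode = meets_bound(8, 8, 6)
punctured_alpha = [meets_bound(4 ** k - 1, 2 * k, 4 ** k) for k in range(1, 6)]
beta = [meets_bound(2 ** (k - 1) * (2 ** k - 1), 2 * k, 2 ** (k - 1) * (2 ** k - 1))
        for k in range(2, 6)]
print(octacode, punctured_alpha, beta)
```

Under these assumptions equality holds in every listed case, for the punctured type $\alpha$ code as well as for the type $\beta$ code.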

The next two results are about the binary linear codes obtained from $\bar S_k^\alpha$ and
$S_k^\beta$.

Theorem 5. Let $C = \bar S_k^\alpha$. Then $C_\phi$ is a $[2^{2k+1} - 2, 2k, 2^{2k}]$ binary linear code
consisting of two copies of the binary simplex code $S_{2k}$ with Hamming weight
distribution same as the Lee weight distribution of $\bar S_k^\alpha$.

Proof. By Lemma 1, $C_\phi$ is a binary linear code of length $2^{2k+1} - 2$ and dimension
$2k$. Let $\tilde G_k$ be a generator matrix of $S_k^\alpha$ in 2-basis form. Then

$$\tilde G_k = \left(\begin{array}{c|c|c|c}
0 \cdots 0 & 1 \cdots 1 & 2 \cdots 2 & 3 \cdots 3 \\
0 \cdots 0 & 2 \cdots 2 & 0 \cdots 0 & 2 \cdots 2 \\
G_{k-1}^\alpha & G_{k-1}^\alpha & G_{k-1}^\alpha & G_{k-1}^\alpha \\
2G_{k-1}^\alpha & 2G_{k-1}^\alpha & 2G_{k-1}^\alpha & 2G_{k-1}^\alpha
\end{array}\right)$$

and

$$\phi(\tilde G_k) = \left(\begin{array}{c|c|c|c}
0000 \cdots 00 & 0101 \cdots 01 & 1111 \cdots 11 & 1010 \cdots 10 \\
0000 \cdots 00 & 1111 \cdots 11 & 0000 \cdots 00 & 1111 \cdots 11 \\
\phi(\tilde G_{k-1}) & \phi(\tilde G_{k-1}) & \phi(\tilde G_{k-1}) & \phi(\tilde G_{k-1})
\end{array}\right).$$

The proof is by induction on $k$. Assume that $\phi(\tilde G_{k-1})$ yields a $[2^{2k-1}, 2(k - 1), 2^{2k-2}]$
binary code in which every nonzero codeword is of weight $2^{2k-2}$. Then
the possible nonzero weight from the lower portion of the above matrix will
be $4 \cdot 2^{2k-2} = 2^{2k}$. From the structure of the first two rows of $\phi(\tilde G_k)$, it is easy to
verify that any linear combination of these rows with other rows has weight $2^{2k}$.
Puncturing the first two columns and rearranging the columns yields the code
having two copies of $S_{2k}$.

Theorem 6. Let $C = S_k^\beta$. Then $C_\phi$ is the binary MacDonald code

$$M_{2k,k} : [2^{2k} - 2^k,\ 2k,\ 2^{2k-1} - 2^{k-1}]$$

with Hamming weight distribution same as the Lee weight distribution of $S_k^\beta$.

Proof. It follows by induction and is similar to the proof of Theorem 5.
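For $k = 2$ the binary linear code spanned by the Gray images of a 2-basis of the type $\beta$ code can be enumerated directly. The sketch assumes that the rows $R_1, 2R_1, R_2, 2R_2$ form a 2-basis, as in the 2-basis form used in the proof of Theorem 5; variable names are illustrative.

```python
from collections import Counter
from itertools import product

GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}


def gray(v):
    """Coordinate-wise Gray map."""
    return [bit for a in v for bit in GRAY[a % 4]]


R1 = [1, 1, 1, 1, 0, 2]
R2 = [0, 1, 2, 3, 1, 1]
two_basis = [R1, [2 * a % 4 for a in R1], R2, [2 * a % 4 for a in R2]]
phi_B = [gray(row) for row in two_basis]

# Enumerate the binary span of the rows of phi(B) and record Hamming weights.
weights = Counter()
for coeffs in product((0, 1), repeat=len(phi_B)):
    word = [0] * len(phi_B[0])
    for use, row in zip(coeffs, phi_B):
        if use:
            word = [w ^ r for w, r in zip(word, row)]
    weights[sum(word)] += 1

print(dict(weights))
```

The result is a [12, 4, 6] code with 12 words of weight 6 and 3 words of weight 8, matching both the MacDonald parameters for $k = 2$ and the Lee weight distribution of the type $\beta$ code.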
The weight hierarchies of $S_k^\alpha$ and $S_k^\beta$ are given by the following two theorems.

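Theorem 7. $S_k^\alpha$ satisfies the chain condition and its weight hierarchy is given
by

$$d_r(S_k^\alpha) = \sum_{i=1}^{r} 2^{2k-i} = 2^{2k} - 2^{2k-r}; \quad 1 \le r \le 2k.$$

Proof. By Remark 3, any $r$-dimensional subcode of $S_k^\alpha$ is of constant Lee weight.
Hence by definition,

$$d_r(S_k^\alpha) = \frac{1}{2^r}(2^r - 1)\, 2^{2k} = 2^{2k} - 2^{2k-r}.$$

Let $D_1 = \langle 2R_1 \rangle$, $D_2 = \langle 2R_1, 2R_2 \rangle$, $D_3 = \langle R_1, 2R_1, 2R_2 \rangle$, $D_4 =
\langle R_1, 2R_1, R_2, 2R_2 \rangle$, $\ldots$, $D_{2k} = \langle R_1, 2R_1, \ldots, R_k, 2R_k \rangle$. It is easy to verify
that

$$D_1 \subseteq D_2 \subseteq \cdots \subseteq D_{2k},$$

and $w_S(D_r) = d_r(S_k^\alpha)$ for $1 \le r \le 2k$.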



Theorem 8. $S_k^{\beta}$ satisfies the chain condition and its weight hierarchy is given
by

$$d_r(S_k^{\beta}) = n(k) - 2^{k-r-1}\left(2^k - 2^{\lceil r/2 \rceil}\right), \qquad 1 \le r \le 2k,$$

where $n(k) = 2^{k-1}(2^k - 1)$.

Proof. The proof follows by induction on $k$. Clearly the result holds for $k = 2$.
Assume that the result holds for $k - 1$. Hence if $1 \le r \le 2k - 2$ then there
exists an $r$-dimensional subcode of $S_{k-1}^{\beta}$ with minimum support size
$n(k-1) - 2^{k-r-2}\left(2^{k-1} - 2^{\lceil r/2 \rceil}\right)$. By Remark 4,

$$d_r(S_k^{\beta}) = 2\,d_r(S_{k-1}^{\beta}) + d_r(S_{k-1}^{\alpha}). \qquad (7)$$

But all $r$-dimensional subcodes of $S_{k-1}^{\alpha}$ have constant support size
$2^{2k-2} - 2^{2k-2-r}$. Thus simplifying (7) yields the result. For $r = 2k - 1$ and $2k$
the result can be easily proved. Let $D_1 = \langle 2R_1 \rangle$, $D_2 = \langle R_1, 2R_1 \rangle$,
$D_3 = \langle R_1, 2R_1, 2R_2 \rangle$, $D_4 = \langle R_1, 2R_1, R_2, 2R_2 \rangle$, ...,
$D_{2k} = \langle R_1, 2R_1, \ldots, R_k, 2R_k \rangle$. Then it is easy to see that

$$D_1 \subset D_2 \subset \cdots \subset D_{2k}$$

is the required chain of subcodes.
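
The simplification of (7) is pure arithmetic and can be checked numerically. The sketch below assumes the closed forms $d_r(S_k^{\alpha}) = 2^{2k} - 2^{2k-r}$ and $d_r(S_k^{\beta}) = n(k) - 2^{k-r-1}(2^k - 2^{\lceil r/2 \rceil})$; the rounding direction in the exponent is an assumption of this sketch (it is the reading that gives $d_1 = 4$ for $k = 2$). Exact fractions are used because $2^{k-r-1}$ has a negative exponent for large $r$.

```python
from fractions import Fraction

def d_alpha(k, r):
    """Assumed closed form of the r-th generalized weight of the type-alpha code."""
    return 2 ** (2 * k) - Fraction(2) ** (2 * k - r)

def n_beta(k):
    """Length of the type-beta code."""
    return 2 ** (k - 1) * (2 ** k - 1)

def d_beta(k, r):
    """Assumed closed form for the type-beta code, with exponent ceil(r/2)."""
    half_up = (r + 1) // 2  # ceil(r / 2) for positive integers
    return n_beta(k) - Fraction(2) ** (k - r - 1) * (2 ** k - 2 ** half_up)

def recursion_holds(k):
    """Check d_r(beta, k) = 2 d_r(beta, k-1) + d_r(alpha, k-1) for 1 <= r <= 2k-2."""
    return all(
        d_beta(k, r) == 2 * d_beta(k - 1, r) + d_alpha(k - 1, r)
        for r in range(1, 2 * k - 1)
    )
```

Under these assumptions the base case $k = 2$ gives the hierarchy 4, 5, 6, 6, and the recursion holds for every $k$ tested.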

The dual code of $S_k^{\alpha}$ is a code of length $2^{2k}$ and 2-dimension $2^{2k+1} - 2k$,
whereas the dual code of $S_k^{\beta}$ is a code of length $2^{k-1}(2^k - 1)$ and 2-dimension
$2^{2k} - 2^k - 2k$. The Hamming and Lee weight distributions of these dual codes can
be obtained with the help of Theorems 3 and 4 and the MacWilliams identities
[8]. Similarly, the weight hierarchies of the duals can be obtained from Theorems
2, 7 and 8.

In [12] Sun and Leib have considered the dual of $S_k^{\beta}$ ($k \ge 3$) in a different
context. They have used combinatorial arguments to obtain a code of length
$n = 2^{r-1}(2^r - 1)$, redundancy $r$ and minimum squared noncoherent weight
$N + 1 - \sqrt{(N - 2)^2 + 9}$, where $N = n - 1$. They have further punctured these
codes to get some good codes in the sense of having larger coding gains over
noncoherent detection.

The results proved in sections 3 and 4 are easily extendible to simplex codes
over $\mathbb{Z}_{2^s}$ by suitably modifying the definition of the Gray map.

Abstract. We consider generalized concatenation of block codes. First
we give a short introduction to the notions of concatenated and
error-locating codes. Then an estimation of the hard decision error correcting
capacity of concatenated codes beyond half the minimum distance is
presented.

1 Introduction 

Code concatenation is a useful method for constructing long codes from shorter 
ones. It is possible to decode these concatenated codes via their component 
codes, leading to a reduced decoding complexity. Moreover, many errors with 
weight exceeding half the minimum distance of the code can be corrected. 

The concatenation of block codes (CC codes) was introduced by Forney [3]. 
Blokh and Zyablov enhanced this definition resulting in so called generalized 
concatenation (GCC) that includes Forney’s approach as a special case. Later 
Zinov’ev [7] modified the definition of generalized concatenation in order to in- 
clude nonlinear component codes. 

The principle of error-locating codes (EL codes) was first described by Wolf 
[6,5]. In [9,1] Zyablov investigated this class of codes and generalized the con- 
struction in a similar way as it was done for concatenated schemes. This led to 
so called generalized error-locating codes (GEL codes) . 

GEL codes are a subclass of the GCC code class given by the definition in
[7]. A formal proof of this can be found in [10,11]. For a detailed introduction 
on generalized concatenation see [2]. 

Usually the error correcting and detecting capabilities of a block code are 
solely characterized by the minimum Hamming distance. This is justified by us- 
ing an (algebraic) decoding algorithm that can decode any error pattern with 
a weight up to half the minimum distance and will usually fail if an error of 
higher weight occurs. However, if we use a decoding algorithm that allows the
correction of error patterns with weight beyond half the minimum distance, this
should be used for an estimation of the decoding performance. We present a
modified approach for the classification of the correcting capability of
concatenated schemes that leads to tighter bounds on the bit error and block error
rate of the generalized concatenated code.

* This work was supported by Deutsche Forschungsgemeinschaft (DFG), Germany.

Marc Fossorier et al. (Eds.): AAECC-13, LNCS 1719, pp. 181-190, 1999.

2 Code Definitions 

A code $C$ of length $n$, dimension $k$ and minimum Hamming distance $d$ with
symbols from $GF(q)$ will be denoted by $C(n,k,d)_q$. For the binary case the
symbol $q$ may be omitted. Let $GF(q^m)$ be an extension field of order $q^m$, where $q$ is
a prime power. An element $a \in GF(q^m)$ in exponential representation will be
denoted by $\mathbf{a}$ if it is given as a vector consisting of the $m$ coefficients of its
polynomial representation. Consequently, a vector $a \in GF(q^m)^n$ can be written as an
$m \times n$ matrix $\mathbf{a}$ with entries from $GF(q)$.



2.1 Code Partitioning 

Let $\mathcal{B}$ be a linear code with codewords $b \in \mathcal{B}$. A $\mu$-way partition of $\mathcal{B}$ is
constructed from $\mu$ disjoint subcodes $\mathcal{B}'_a$, $a = 1, \ldots, \mu$, with $\bigcup_a \mathcal{B}'_a = \mathcal{B}$. We will
denote such a partition by $\mathcal{B}/\mathcal{B}'$. $\mu$ is called the order of the partition and $a$ is
a label identifying a unique subset $\mathcal{B}'_a$.

If a set is partitioned $L$ times by repeated partitioning of the subsets we get
an $L$-level partition chain $\mathcal{B}^{(1)}/\cdots/\mathcal{B}^{(L+1)}$. Subsets in such a partition chain
are labeled by a multi-part label, e.g. the label $(a^{(1)}, a^{(2)}, \ldots, a^{(L)})$ denotes a
subset in the last partition level of an $L$-level partitioning. The subcode that
contains the all zero vector (linear subcode) will be labeled with a '0'. $d(\mathcal{B})$
denotes the minimum Hamming distance of a code. If a subcode $\mathcal{B}^{(l)}$ consists of
only one single codeword we define $d(\mathcal{B}^{(l)}) = \infty$.

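A linear subcode $\mathcal{B}_0^{(l+1)}$ of the code $\mathcal{B}^{(l)}$ induces a partition of $\mathcal{B}^{(l)}$ into
cosets. A coset is given by adding any element of this coset, say $b$, to the linear
subcode: $\mathcal{B}_0^{(l+1)} + b$, $b \in \mathcal{B}^{(l)}$. Therefore $b$ is called coset
representative of the subcode with label $a^{(l)}$. The union of all cosets adds to
$\mathcal{B}^{(l)}$:

$$\mathcal{B}^{(l)} = \bigcup_{b \in [\mathcal{B}^{(l)}/\mathcal{B}^{(l+1)}]} \left( \mathcal{B}_0^{(l+1)} + b \right),$$

where $[\mathcal{B}^{(l)}/\mathcal{B}^{(l+1)}]$ is a set of all coset
representatives. There are sets of coset representatives that form a linear code
and can be described by a generator matrix $G_{\mathcal{B}^{(l)}/\mathcal{B}^{(l+1)}}$. $G_{\mathcal{B}^{(l)}/\mathcal{B}^{(l+1)}}$ describes
the mapping of the label $a^{(l)}$ to a coset representative of level $l$. Hence it is
possible to calculate a code word of the subcode $b \in \mathcal{B}^{(1)}$:

$$b = G_{\mathcal{B}^{(1)}/\mathcal{B}^{(L+1)}} \cdot \begin{pmatrix} a^{(L)} \\ \vdots \\ a^{(1)} \end{pmatrix},
\quad \text{where} \quad
G_{\mathcal{B}^{(1)}/\mathcal{B}^{(L+1)}} = \begin{pmatrix} G_{\mathcal{B}^{(L)}/\mathcal{B}^{(L+1)}} \\ \vdots \\ G_{\mathcal{B}^{(1)}/\mathcal{B}^{(2)}} \end{pmatrix}.$$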
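$G_{\mathcal{B}^{(1)}/\mathcal{B}^{(L+1)}}$ is the combination of all generator matrices of coset representatives.
As the mapping done by the generator matrices is unique, there exists an
inverse mapping which produces the label of a partition at level $l$ if a coset
element is given. The matrix describing this mapping is a (partial) parity check
matrix of the linear subcode $\mathcal{B}_0^{(l+1)}$ and will be denoted by $H_{\mathcal{B}^{(l)}/\mathcal{B}^{(l+1)}}$. The
combination of all the matrices $H_{\mathcal{B}^{(l)}/\mathcal{B}^{(l+1)}}$, $l = 1, \ldots, L$, defines the linear
mapping from a given coset to the label:

$$\begin{pmatrix} a^{(L)} \\ \vdots \\ a^{(1)} \end{pmatrix} = H_{\mathcal{B}^{(1)}/\mathcal{B}^{(L+1)}} \cdot b,
\quad \text{where} \quad
H_{\mathcal{B}^{(1)}/\mathcal{B}^{(L+1)}} = \begin{pmatrix} H_{\mathcal{B}^{(L)}/\mathcal{B}^{(L+1)}} \\ \vdots \\ H_{\mathcal{B}^{(1)}/\mathcal{B}^{(2)}} \end{pmatrix}.$$

A code $\mathcal{B}(n_b, n_b, 1)_q$ consisting of all possible $n_b$-tuples will be denoted by $GF(q)^{n_b}$
and the single codeword code $\mathcal{B}(n_b, 0, \infty)$ by $\{0\}$. Note that $G_{\mathcal{B}^{(l)}/\{0\}} = G_{\mathcal{B}^{(l)}}$
and $H_{GF(q)^{n_b}/\mathcal{B}^{(l)}} = H_{\mathcal{B}^{(l)}}$.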

2.2 Generalized Concatenated Codes 

The definition of GCC codes is based on the description given by Zinov’ev in [7] 
and is not restricted to linear component codes. 

Definition 1 (GCC Code). An $L$-level generalized concatenated code (GCC
code) $C$ consists of $L$ outer codes $\mathcal{A}^{(l)}(n_a, k_a^{(l)}, d_a^{(l)})_{q^{m_a^{(l)}}}$ and an inner code
$\mathcal{B}^{(1)}(n_b, k_b^{(1)}, d_b^{(1)})_q$ that is partitioned into an $L$-level partition chain
$\mathcal{B}^{(1)}(n_b, k_b^{(1)}, d_b^{(1)})_q / \cdots / \mathcal{B}^{(L+1)}(n_b, 0, d_b^{(L+1)} = \infty)_q$. The
subcodes of level $L + 1$ consist of a single codeword only. The symbol $a_i^{(l)}$ of the
$l$th outer code determines a subset of $\mathcal{B}^{(l)}_{a_i^{(1)}, \ldots, a_i^{(l-1)}}$. Thus the symbols $a_i^{(l)}$,
$l = 1, \ldots, L$, of all outer codes together form the label $(a_i^{(1)}, \ldots, a_i^{(L)})$ of a unique
codeword $b_i$. The codeword of the GCC code consists of all the codewords $b_i$,
$i = 1, \ldots, n_a$.

The construction principle of GCC codes is illustrated in figure 1. The GCC
code has length $n_c = n_a \cdot n_b$, dimension $k_c = \sum_{l=1}^{L} k_a^{(l)} \cdot m_a^{(l)}$, and a minimum
Hamming distance $d_c \ge \min_{l=1,\ldots,L} \{ d_a^{(l)} \cdot d_b^{(l)} \}$ (see [1,4]). We will assume that
$k_b^{(l)} - k_b^{(l+1)} - m_a^{(l)} = 0$, for all $l = 1, \ldots, L$. If $L = 1$ the GCC code
reduces to an ordinary concatenated code.

Encoding can be done by first encoding $k_a^{(l)}$ information symbols into a
codeword $a^{(l)}$ for $l = 1, \ldots, L$ and then mapping the label $(a_i^{(1)}, \ldots, a_i^{(L)})$ on to the
codeword $b_i$. This encoding scheme, however, will not be systematic.

Example 1 (GCC Code). We construct a two-level GCC code using a partition
chain of the inner binary Hamming code $\mathcal{B}^{(1)}(7,4,3)/\mathcal{B}^{(2)}(7,3,4)/\mathcal{B}^{(3)}(7,0,\infty)$
and two outer codes $\mathcal{A}^{(1)}(7,1,7)_2$ (a repetition code) and $\mathcal{A}^{(2)}(7,4,4)_{2^3}$. The
parameters of the generalized concatenated code construction are $n_c = 49$,
$k_c = k_a^{(1)} \cdot m_a^{(1)} + k_a^{(2)} \cdot m_a^{(2)} = 1 + 12 = 13$ and $d_c \ge \min\{7 \cdot 3, 4 \cdot 4\} = 16$.

[Figure: $L$ outer codes; columnwise mapping defined by an $L$-level partition chain of the inner code.]

Fig. 1. Construction principle of generalized concatenated codes.
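
The parameter formulas for GCC codes can be evaluated mechanically. The sketch below is illustrative only: the function name and the tuple layout are hypothetical, and each outer code is described by $(n_a, k_a^{(l)}, d_a^{(l)}, m_a^{(l)})$, where $m_a^{(l)}$ is the extension degree of the outer code alphabet over $GF(q)$.

```python
def gcc_parameters(n_b, outer_codes, inner_distances):
    """Length, dimension and designed distance of a GCC code.

    outer_codes: list of (n_a, k_a, d_a, m_a), one tuple per level.
    inner_distances: minimum distances d_b of the inner codes, one per level.
    """
    n_a = outer_codes[0][0]
    # All outer codes must share the same length n_a.
    assert all(code[0] == n_a for code in outer_codes)
    n_c = n_a * n_b
    # Each outer information symbol carries m_a symbols of GF(q).
    k_c = sum(k_a * m_a for _, k_a, _, m_a in outer_codes)
    # Lower bound on the minimum distance: min over levels of d_a * d_b.
    d_lower = min(d_a * d_b
                  for (_, _, d_a, _), d_b in zip(outer_codes, inner_distances))
    return n_c, k_c, d_lower

# Two-level code with outer codes (7,1,7) over GF(2) and (7,4,4) over GF(2^3)
# and inner distances 3 and 4, as in Example 1.
example_1 = gcc_parameters(7, [(7, 1, 7, 1), (7, 4, 4, 3)], [3, 4])
```

With the values of Example 1 this evaluates to length 49, dimension 13 and designed distance 16.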



2.3 Generalized Error-Locating Codes 

Error-locating codes can be defined by the parity check matrices of the compo- 
nent codes [5]. 

Definition 2 (GEL Code). An $L$-level generalized error-locating code (GEL
code) consists of $L$ outer codes $\mathcal{A}^{(l)}(n_a, k_a^{(l)}, d_a^{(l)})_{q^{m_a^{(l)}}}$, $l = 0, \ldots, L - 1$, and $L$
inner codes $\mathcal{B}^{(l)}(n_b, k_b^{(l)}, d_b^{(l)})_q$, $l = 1, \ldots, L$. Let

$$H_{\mathcal{B}^{(L)}} = \begin{pmatrix} H_{\mathcal{B}^{(L-1)}/\mathcal{B}^{(L)}} \\ \vdots \\ H_{\mathcal{B}^{(0)}/\mathcal{B}^{(1)}} \end{pmatrix}$$

be a parity check matrix of the inner code $\mathcal{B}^{(L)}$ and $a^{(l)}$ a codeword of the $l$th
outer code $\mathcal{A}^{(l)}$. Each codeword $C$ of a GEL code in matrix form fulfills

$$a^{(l)} = H_{\mathcal{B}^{(l)}/\mathcal{B}^{(l+1)}} \cdot C, \qquad (1)$$

for all $l = 0, \ldots, L - 1$.

The defining equation (eq. 1) of a GEL code is equivalent to the calculation of
the syndrome. Therefore the codevector $a^{(l)}$ is also known as syndrome. Notice
that the columns of the codeword matrix $C$ are in general not codewords of $\mathcal{B}^{(L)}$
or one of its subcodes.

The redundancies of outer and inner codes are given by $r_a^{(l)} = n_a - k_a^{(l)}$ and
$r_b^{(l)} = n_b - k_b^{(l)}$, respectively. We require that $m_a^{(l)} = k_b^{(l)} - k_b^{(l+1)}$ and $k_b^{(0)} = n_b$,
for all $l = 0, \ldots, L - 1$. The GEL code has length $n_c = n_a n_b$, redundancy
$r_c = \sum_{l=0}^{L-1} r_a^{(l)} m_a^{(l)}$ and dimension $k_c = n_a k_b^{(L)} + \sum_{l=0}^{L-1} k_a^{(l)} m_a^{(l)} = n_c - r_c$. For
$L = 1$ the GEL code reduces to an ordinary EL coding scheme.

Fig. 2. Systematic encoding of generalized error-locating codes.



We consider systematic encoding. The encoders for the outer codes should
be systematic. Further we assume that the matrices $H_{\mathcal{B}^{(l-1)}/\mathcal{B}^{(l)}}$ are given by
( iTg(i-i)/g(i) 

Hq{L) = . 

/g(2) 

V 

where $I_m$ is an $m_a^{(l)} \times m_a^{(l)}$ identity matrix. Note that any $H_{\mathcal{B}^{(L)}}$ describing the
partition chain and its labeling of the inner code can be transformed into such
a representation without destroying the nested partition. This is possible if row
operations within $H_{\mathcal{B}^{(l-1)}/\mathcal{B}^{(l)}}$ and column permutations of $H_{\mathcal{B}^{(L)}}$ are allowed.
Of course, permuting the columns of $H_{\mathcal{B}^{(L)}}$ will lead to an equivalent code, but
without changing the properties of the concatenated code. The matrix $H_{\mathcal{B}^{(L)}}$
cannot be made systematic while preserving the partition structure. Therefore,
there is still a nonsystematic mapping that combines outer and inner codes.
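
The GEL parameter formulas of this section (length $n_a n_b$, redundancy $\sum_l r_a^{(l)} m_a^{(l)}$, dimension $n_a k_b^{(L)} + \sum_l k_a^{(l)} m_a^{(l)}$) can be cross-checked against each other. The following sketch uses hypothetical names; the numerical values are those of the length-1024 code discussed later in the text, with the extension degree of the first outer code taken as 1, which is an assumption.

```python
def gel_parameters(n_a, n_b, k_b_last, outer_codes):
    """Length, redundancy and dimension of a GEL code.

    k_b_last: dimension of the last inner code in the chain.
    outer_codes: list of (k_a, m_a) for the levels l = 0, ..., L-1.
    """
    # The extension degrees must add up to the redundancy of the last inner code.
    assert n_b - k_b_last == sum(m_a for _, m_a in outer_codes)
    n_c = n_a * n_b
    r_c = sum((n_a - k_a) * m_a for k_a, m_a in outer_codes)
    k_c = n_a * k_b_last + sum(k_a * m_a for k_a, m_a in outer_codes)
    # Both expressions for the dimension have to agree.
    assert k_c == n_c - r_c
    return n_c, r_c, k_c

# Inner chain ending in a (32, 21, 6) code, outer codes with k_a = 16, 29, 31
# and assumed extension degrees 1, 5, 5.
example = gel_parameters(32, 32, 21, [(16, 1), (29, 5), (31, 5)])
```

For these values the redundancy is 36 and the dimension is 988 for length 1024.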

Figure 2 illustrates the encoding process. After filling the white part of the
codeword matrix $C$ with information symbols, the information part of every
outer codeword $a^{(l)}$ can be calculated using equation 1. Note that this can be done
without knowing any of the redundancy matrices in $C$, denoted by $P^{(l)}$. Then
the redundancy part of the outer codewords $a^{(l)}$, $l = 0, \ldots, L - 1$, results from
ordinary systematic encoding. In the last step the matrices $P^{(l)}$ are calculated
for all $l = 0, \ldots, L - 1$ from the redundancy parts of the outer codewords and the
upper right submatrix of $C$ (see figure 2). As part of the redundancy matrices that
have already been computed is needed for the calculation of $P^{(l)}$, the recursive
calculation of the matrices $P^{(l)}$ starts with $l = L - 1$.



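Example 2 (GEL Code). A two-level GEL code is constructed using the inner
codes $\mathcal{B}^{(1)}(7,6,2)$ and $\mathcal{B}^{(2)}(7,3,4)$ and the outer codes $\mathcal{A}^{(0)}(7,1,7)_2$ and
$\mathcal{A}^{(1)}(7,4,4)_{2^3}$. A sample parity check matrix $H_{\mathcal{B}^{(2)}}$ is

$$H_{\mathcal{B}^{(2)}} = \begin{pmatrix}
1 & 0 & 1 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 1
\end{pmatrix},$$

where the first three rows form $H_{\mathcal{B}^{(1)}/\mathcal{B}^{(2)}}$ and the last row forms $H_{\mathcal{B}^{(0)}/\mathcal{B}^{(1)}}$.

The resulting concatenated code has length $n_c = 49$ and dimension
$k_c = n_a k_b^{(2)} + \sum_{l=0}^{1} k_a^{(l)} m_a^{(l)} = 21 + 1 + 12 = 34$.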
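
The nested partition behind this example can be enumerated directly, since the inner code has only length 7. The sketch below takes the four rows of the sample parity check matrix as given; reading the all-ones row as the first partition level and the remaining three rows as the second level is an assumption of this sketch, as are all names.

```python
from collections import defaultdict
from itertools import product

# Rows of the sample matrix: three rows for the second level (assumed split)
# and the all-ones overall parity check for the first level.
H_UPPER = [
    (1, 0, 1, 1, 0, 0, 0),
    (1, 1, 1, 0, 1, 0, 0),
    (1, 1, 0, 0, 0, 1, 0),
]
H_PARITY = [(1, 1, 1, 1, 1, 1, 1)]

def syndrome(H, word):
    """Binary syndrome of word with respect to H, used as the partition label."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)

def null_space(H, n=7):
    """All binary words of length n with zero syndrome (brute force)."""
    return [w for w in product((0, 1), repeat=n) if not any(syndrome(H, w))]

def min_distance(code):
    """Minimum Hamming weight over the nonzero words of a linear code."""
    return min(sum(w) for w in code if any(w))

def cosets(code, H):
    """Group the words of code by their label under H."""
    groups = defaultdict(list)
    for w in code:
        groups[syndrome(H, w)].append(w)
    return dict(groups)

B1 = null_space(H_PARITY)            # words satisfying the overall parity check
B2 = null_space(H_UPPER + H_PARITY)  # words satisfying all four checks
LABELLED = cosets(B1, H_UPPER)       # partition of B1 into cosets of B2
```

Under this reading the first subcode has 64 words with minimum distance 2, the second has 8 words with minimum distance 4, and the three-bit labels split the first into 8 cosets of 8 words each.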



Description of the Decoding Principle: A codeword $C$ of the generalized
concatenated code, given in matrix form, will be corrupted by some error pattern
$E \in GF(q)^{n_b \times n_a}$ after transmitting it over a noisy channel: $\hat{C} = C + E$.
Decoding of generalized error-locating codes can be done by means of a multi-stage
decoding algorithm. Decoding starts with calculating the first syndrome part
$\hat{a}^{(0)} = H_{\mathcal{B}^{(0)}/\mathcal{B}^{(1)}} \hat{C}$. Then $\hat{a}^{(0)}$ is decoded into $a^{(0)}$, providing the labels of the
cosets needed to do decoding of $\hat{c}_j$, $j = 1, \ldots, n_a$, in some (usually) nonlinear
subspace of $\mathcal{B}^{(0)}$. In the second stage of the decoding first an additional part of
the syndrome ($\hat{a}^{(1)}$) is calculated. Then decoding of $\hat{a}^{(1)}$ is performed and so
on. The information of corrupted columns of the code matrix resulting from the
previous decoding step can be used to erase some of the symbols of $\hat{a}^{(l)}$. This
decoding principle gave rise to the term error-locating codes. If decoding of the
outer codes fails for any step then decoding of the concatenated code will fail,
too.



2.4 The Connection between GCC and GEL Codes 



A formal proof of this result can be found in [10,11]. 

Both code concatenation descriptions are based on the same partition
principle of the inner code. The main difference is the labeling of the partition. For
the case of GEL codes, the parity check matrix $H_{\mathcal{B}^{(L)}}$ is specified, whereas for
GCC codes the labeling is not defined. If labeling of the subcodes is done by a
linear mapping and consequently can be described via a matrix $G_{\mathcal{B}^{(1)}}$ then both
descriptions are identical:

$$G_{\mathcal{B}^{(1)}} \cdot a = C \quad \Longleftrightarrow \quad a = H_{\mathcal{B}^{(L)}} \cdot C,$$

provided that $G_{\mathcal{B}^{(1)}} = H_{\mathcal{B}^{(L)}}^{-1}$. Since $H_{\mathcal{B}^{(L)}}$ has full rank, it is always possible to
find such a $G_{\mathcal{B}^{(1)}}$.

Therefore both constructions can be used to describe the same code: A linear
$L$-level GCC code can be described as a special case of an $(L + 1)$-level GEL
code and an $L$-level GEL code is a special case of an $(L + 1)$-level GCC code.

$L$-level GEL $\to$ $(L+1)$-level GCC: An $L$-level GEL code with inner component
codes $\mathcal{B}^{(l)}(n_b, k_b^{(l)}, d_b^{(l)})_q$, $l = 1, \ldots, L$, and outer component codes
$\mathcal{A}^{(l)}(n_a, k_a^{(l)}, d_a^{(l)})_{q^{m_a^{(l)}}}$, $l = 0, \ldots, L - 1$, is a special case of an $(L + 1)$-level GCC
code with the additional codes $\mathcal{B}^{(0)}(n_b, n_b, 1)_q$ and $\mathcal{A}^{(L)}(n_a, n_a, 1)_{q^{k_b^{(L)}}}$. Note that
indexing of the component codes has only been shifted by 1 compared to the
original definition of GCC codes. This description reveals the minimum distance
of the code: $d_c \ge \min_{l=0,\ldots,L} \{ d_a^{(l)} \cdot d_b^{(l)} \}$.

Example 3. The two-level GEL code from example 2 shall be interpreted as a
three-level GCC code. For the GCC description we need two more component
codes: $\mathcal{B}^{(0)}(7,7,1)$ and $\mathcal{A}^{(2)}(7,7,1)_{2^3}$. The minimum distance of the code is
$d_c \ge \min\{ d_b^{(0)} \cdot d_a^{(0)},\ d_b^{(1)} \cdot d_a^{(1)},\ d_b^{(2)} \cdot d_a^{(2)} \} = \min\{1 \cdot 7,\ 2 \cdot 4,\ 4 \cdot 1\} = 4$.

$L$-level GCC $\to$ $(L+1)$-level GEL: An $L$-level GCC code with inner and outer
component codes $\mathcal{B}^{(l)}(n_b, k_b^{(l)}, d_b^{(l)})_q$ and $\mathcal{A}^{(l)}(n_a, k_a^{(l)}, d_a^{(l)})_{q^{m_a^{(l)}}}$, $l = 1, \ldots, L$, is a
special case of an $(L + 1)$-level GEL code with the additional codes
$\mathcal{B}^{(L+1)}(n_b, 0, \infty)_q$ and $\mathcal{A}^{(0)}(n_a, 0, \infty)_{q^{r_b^{(1)}}}$.

Example 4. An interpretation of the two-level GCC code from example 1 with
$\mathcal{B}^{(1)}(7,4,3)$, $\mathcal{B}^{(2)}(7,3,4)$, $\mathcal{A}^{(1)}(7,1,7)_2$, and $\mathcal{A}^{(2)}(7,4,4)_{2^3}$ as a three-level GEL
code requires the additional component codes $\mathcal{B}^{(3)}(7,0,\infty)$ and $\mathcal{A}^{(0)}(7,0,\infty)_{2^3}$.

Since an $L$-level GEL code is only a special case of an $(L + 1)$-level GCC
code, decoding algorithms for GCC codes can be used for GEL codes, too. Thus
an algorithm of Blokh and Zyablov (see e.g. [8]) can be used to decode all
error patterns whose weight does not exceed $\lfloor (d_c - 1)/2 \rfloor$. However, this algorithm can
correct many error patterns of higher weight, too. In the following section we
will present a method for estimating the bit error rate that will take into account
these additional correcting capabilities.

3 On the Hard Decision Correcting Capability of 
Generalized Concatenated Codes 

If an algorithm is used for decoding that can correct all error patterns with
a weight up to half the minimum distance of the code and in addition many
patterns of higher weight, then the minimum distance is not a sufficient
description of the correcting capability of the code. This is the case for the decoding
of generalized concatenated codes with the algorithm of Blokh and Zyablov [8].
Therefore we present bounds for the bit error and the block error rate that take
into account this additional error correcting capacity. Transmission over a
binary symmetric channel (BSC) is assumed. The bounds depend on the decoding
algorithm on which the estimation of the number of errors after decoding is
based.

Estimation of Bit Error Rate: Let $s = \lfloor (d_c - 1)/2 \rfloor$ be the number of errors a bounded
minimum distance decoder can correct. We will state an upper and a lower bound
for the bit error rate of the code bits of a generalized concatenated code that
can correct at least a part of all error patterns up to the weight $t > s$. For a
good estimate, $t$ should be chosen such that it includes most of the additional
correctable error patterns. However, since $t$ is also the maximum number of
errors that can be added to a code vector in case of wrong correction, $t$ should
not be too large. Moreover the effort for the calculation of the bounds increases
very fast with increasing $t$. One has to find a good compromise.

The codewords $C$ in matrix form consist of $n_a$ columns of length $n_b$, resulting
in a total code length of $n_c = n_a \cdot n_b$. Let $e = (e_1, e_2, \ldots, e_{n_b}) \in \mathcal{E}$ be the
distribution of the column weights of an error pattern, i.e. $e_i$ is the number of
columns containing $i$ errors and $\mathcal{E}$ is the set of all possible error distributions.
$|e| = \sum_{i=1}^{n_b} i \cdot e_i$ denotes the total number of errors. Now it is possible to split the
bit error rate in a part that is due to all the errors of weight $\le t$, and a second
part that is due to the remaining errors with weight $> t$:

$$\overline{P}_{bit}(t) = p(|e| \le t) + p(|e| > t). \qquad (2)$$

We assume that the decoding algorithm will not add more errors than it is able
to correct. Therefore we can upper bound the second term ($p$ is the crossover
probability of a BSC):

$$p(|e| > t) \le \frac{1}{n_c} \sum_{e \in \mathcal{E},\ |e| > t} N(e) P(e) \le \frac{1}{n_c} \sum_{i=t+1}^{n_c} (i + t) \binom{n_c}{i} p^i (1 - p)^{n_c - i}. \qquad (3)$$

$N(e) \le (|e| + t)$ is the maximum number of bit errors that can result, given a
weight distribution of bit errors $e$. $P(e) = U(e)\, p^{|e|} (1 - p)^{n_c - |e|}$ is the probability
of this weight distribution. The first term of equation 2 is estimated by a sum
of all different weight distributions of error patterns $|e| \le t$:

$$p(|e| \le t) \le \frac{1}{n_c} \sum_{e \in \mathcal{E},\ |e| \le t} N(e) P(e) = \frac{1}{n_c} \sum_{e \in \mathcal{E},\ |e| \le t} N(e) U(e)\, p^{|e|} (1 - p)^{n_c - |e|},$$

where the number $U(e)$ of error patterns of a weight distribution results from the
number of possibilities for the selection of erroneous columns times the number
of possibilities to place the errors within these columns:

$$U(e) = \prod_{i=1}^{n_b} \binom{n_a - \sum_{j=1}^{i-1} e_j}{e_i} \binom{n_b}{i}^{e_i}.$$

Fig. 3. A comparison of some bounds on the bit error rate of the concatenated
code given in example 5.
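
The count $U(e)$ is easy to get wrong by hand, so a small numerical check may help. The sketch below assumes the product form $U(e) = \prod_i \binom{n_a - \sum_{j<i} e_j}{e_i} \binom{n_b}{i}^{e_i}$ (choose the columns carrying $i$ errors among those not yet used, then place $i$ errors in each); trailing zeros of $e$ may be omitted. The function names are hypothetical.

```python
from math import comb

def total_errors(e):
    """|e|: total number of errors, where e[i-1] columns contain i errors each."""
    return sum(i * e_i for i, e_i in enumerate(e, start=1))

def num_error_patterns(e, n_a, n_b):
    """Number of error patterns with column-weight distribution e."""
    count, used_columns = 1, 0
    for i, e_i in enumerate(e, start=1):
        # Choose e_i of the remaining columns, then i positions in each of them.
        count *= comb(n_a - used_columns, e_i) * comb(n_b, i) ** e_i
        used_columns += e_i
    return count

# Consistency check for a 32 x 32 codeword matrix: the distributions of total
# weight 3 must together account for all C(1024, 3) error patterns.
weight_3 = [(3,), (1, 1), (0, 0, 1)]
patterns_3 = sum(num_error_patterns(e, 32, 32) for e in weight_3)
```

Summing $U(e)$ over all distributions with the same total weight $w$ has to give $\binom{n_c}{w}$, which is a convenient sanity check on any implementation.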



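The calculation of $N$ has to be done individually for all $|e| \le t$. The
corresponding lower bound $\underline{P}_{bit}(t)$ is given by

$$\underline{P}_{bit}(t) = p(|e| \le t) + p(|e| > t), \qquad (4)$$

$$p(|e| > t) \ge \frac{1}{n_c} \sum_{e \in \mathcal{E},\ |e| > t} N(e) P(e) \ge \frac{1}{n_c} \sum_{i=t+1}^{n_c} (i - t) \binom{n_c}{i} p^i (1 - p)^{n_c - i},$$

$$p(|e| \le t) \ge \frac{1}{n_c} \sum_{e \in \mathcal{E},\ |e| \le t} N(e) U(e)\, p^{|e|} (1 - p)^{n_c - |e|},$$

with $N(e)$ as the minimum number of bit errors that will result from decoding.
For $p \to 0$ both bounds will coincide.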
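
The two tail terms, the upper one with factor $(i + t)$ and the lower one with factor $(i - t)$, can be evaluated numerically as sketched below. This is an illustration under the assumption that the tails have the binomial form $\frac{1}{n_c}\sum_{i=t+1}^{n_c} (i \pm t)\binom{n_c}{i} p^i (1-p)^{n_c-i}$; the function names are hypothetical. The binomial terms are computed in the log domain so that long codes do not overflow floating point, which requires $0 < p < 1$.

```python
from math import comb, exp, log, log1p

def binomial_term(n, i, p):
    """C(n, i) p^i (1-p)^(n-i), evaluated via logarithms; needs 0 < p < 1."""
    return exp(log(comb(n, i)) + i * log(p) + (n - i) * log1p(-p))

def bit_tail_upper(n_c, t, p):
    """Upper bound on the contribution of error patterns of weight > t."""
    return sum((i + t) * binomial_term(n_c, i, p)
               for i in range(t + 1, n_c + 1)) / n_c

def bit_tail_lower(n_c, t, p):
    """Lower bound on the contribution of error patterns of weight > t."""
    return sum((i - t) * binomial_term(n_c, i, p)
               for i in range(t + 1, n_c + 1)) / n_c
```

A call such as `bit_tail_upper(1024, 3, 1e-3)` gives the tail for a length-1024 code; the ratio of the two tails approaches $(2t + 1)$ as $p$ becomes small, since the term $i = t + 1$ then dominates both sums.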

Example 5 (Bounds on the Bit Error Rate). We consider a concatenated code of
length $n_c = 1024$ and dimension $k = 988$ based on a partition chain of the inner
code $\mathcal{B}^{(0)}(32,32,1)/\mathcal{B}^{(1)}(32,31,2)/\mathcal{B}^{(2)}(32,26,4)/\mathcal{B}^{(3)}(32,21,6)$ and the outer
codes $\mathcal{A}^{(0)}(32,16,8)$, $\mathcal{A}^{(1)}(32,29,4)_{2^5}$ and $\mathcal{A}^{(2)}(32,31,2)_{2^5}$. As the minimum
Hamming distance of the code is 6, each error of weight 1 or 2 can be
corrected using the multi-stage decoding algorithm by Blokh and Zyablov. In
addition almost all of the error patterns of weight 3 can be corrected, the
exception being the pattern given by $e = (0, 0, 1, 0, \ldots, 0)$. This error pattern,
however, seldom occurs as compared to all the correctable errors and can be detected.
Thus the decoding result is similar to a three-error correcting code. Figure 3 compares
the bounds of eq. 2 and 4 for $t = 3$ with upper bounds on two- and three-error
correcting codes (given by eq. 3, $t = 2, 3$).

Estimation of Block Error Rate: The block error rate is upper bounded by the
error rate of an ordinary $t$ error correcting code and the error rate that is caused
by error patterns of weight $\le t$ (we ignore error patterns of weight $> t$ that can
be corrected):

$$P_{block}(t) \le P_{block}^{(t)} + P_{block}(|e| \le t),$$

$$P_{block}^{(t)} = \sum_{i=t+1}^{n_c} \binom{n_c}{i} p^i (1 - p)^{n_c - i},$$

$$P_{block}(|e| \le t) = \sum_{e \in \mathcal{E},\ |e| \le t} \tilde{N}(e)\, U(e)\, p^{|e|} (1 - p)^{n_c - |e|}.$$

$\tilde{N}(e)$ indicates whether a given weight distribution of an error pattern oversteps
the correcting capacity of the code:

$$\tilde{N}(e) = \begin{cases} 1 & \text{if } N(e) > 0, \\ 0 & \text{if } N(e) = 0. \end{cases}$$

Abstract. In this paper a bootstrap iterative decoding technique
concatenated with the Viterbi algorithm (BIVA) and trellis shaping for
trellis-coded modulation (TCM) is proposed. The concept of bootstrap
decoding is introduced first and then the new metric functions
which take into account the bootstrap iterative decoding algorithm for
TCM systems are derived. One- and two-dimensional bootstrap decoding
are proposed for packet transmission. Furthermore, the trellis shaping
technique is combined with such TCM schemes using
the BIVA. The simulation results show that a performance within 1.25 dB
of the Shannon limit can be achieved by the BIVA and trellis shaping
for a 256-state 6 bits/T TCM scheme, with low complexity and reasonable
computation.



1 Introduction 

Trellis-Coded Modulation (TCM) has been widely used as a combined coding
and modulation technique for digital transmission over band-limited channels.
Ungerboeck has shown that significant coding gains can be achieved using
trellis-coded modulation with the Viterbi decoding algorithm over uncoded modulation
without sacrificing bandwidth efficiency on a band-limited channel [1]. In the past
decade, many variants on the basic TCM scheme have been developed to obtain
higher coding gains.

Bootstrap decoding [4]-[7] is a method which imposes algebraic constraints
on streams of convolutionally encoded information sequences. Such constraints
can then be used to gather extrinsic information from other streams when
one stream is decoded. In [7] Wei extended the results of [4]-[6] to near-optimal
bootstrap decoding using long convolutional codes. One of the simplified
bootstrap algorithms, which only uses the Viterbi algorithm, was given in [7] and
named BIVA in [8].

In this paper, a system concatenating trellis codes with bootstrap
decoding is proposed. We focus on how to apply the BIVA to TCM, and one-
and two-dimensional bootstrap structures are designed. In addition, we note
that the bootstrap iterative decoding algorithm can be modified to accommodate
trellis shaping [10] so that full shaping gains can be achieved.

The paper is organised as follows. In section 2, the principle of the encoding
system with bootstrap decoding is described. We then present the BIVA, which
combines the bootstrap iterative decoding with the Viterbi algorithm, in section 3,
and the new metric functions for bootstrap TCM schemes are also derived. In
section 4, one- and two-dimensional bootstrap structures are proposed. In section
5, concatenating shaping techniques with bootstrap TCM using the BIVA is
considered. The numerical simulation results and discussion are reported
in section 6 to compare the performance of the BIVA with the Viterbi algorithm.
Lastly, in section 7, the conclusion is given.



2 Review of the Bootstrap Structure

Bootstrap decoding is a method which utilises algebraic constraints across streams
of convolutionally encoded information sequences. Before introducing the BIVA
and its decoding algorithm, let us look at the basic concept of bootstrap
decoding.

In bootstrap decoding [4], a medium size packet can be formed by
organising $(m_b - 1) \times l$ information symbols as a block of $(m_b - 1)$ rows and $l$ columns.
Suppose $[U]_{j,l} = [U_{j,1}, \ldots, U_{j,l}]$ and $[V]_{j,l} = [V_{j,1}, V_{j,2}, \ldots, V_{j,l}]$ denote the
information symbol vector and the encoded codeword vector of the $j$th packet
stream, respectively. As usual, we will encode each packet of $l \times k$ binary
information bits into codewords of length $l$ through the same $k/n$ convolutional
encoder, and each encoded symbol $V_{j,i}$ ($i = 1, 2, \ldots, l$) has $n$ bits.

An $m_b$th packet is then generated in such a way that its $i$th digit will be the
parity of the $i$th digits of the $(m_b - 1)$ information packets, namely, the $m_b$th
packet is a modulo 2 position-by-position sum of the above $(m_b - 1)$ information
packets, i.e.,

$$U_{m_b,i} = U_{1,i} \oplus U_{2,i} \oplus \cdots \oplus U_{m_b-1,i}, \qquad i = 1, 2, \ldots, l, \qquad (1)$$

where $\oplus$ denotes a modulo 2 sum. The $m_b$th packet is therefore called the
parity-check-constraint (PCC) packet. It was shown in [7] that the free distance of
such a block code is doubled. The PCC packet is also encoded by the same
convolutional encoder. Fig. 1 shows the bootstrap structure of one block of $m_b$
rows and $l$ columns. Because of the linearity of convolutional encoding^1, the
PCC packet corresponds to a path in the coding tree whose information digits
are the mod 2 sum of the information digits underlying the information packets;
i.e.,

$$V_{m_b,i} = V_{1,i} \oplus V_{2,i} \oplus \cdots \oplus V_{m_b-1,i}, \qquad i = 1, 2, \ldots, l. \qquad (2)$$

^1 All the convolutional encoders shown in this paper are assumed to be linear, which
guarantees that the PCC condition among the information packets still exists for
coded symbols after the coding process.

Fig. 1. The bootstrap structure of one block code (rows: packet 1, packet 2, ..., packet $m_b - 1$, packet $m_b$).

Hence, all $m_b$ packets are in principle decodable. Suppose next that the $m_b$
encoded packets are sent through the channel disturbed by AWGN, and that
the corresponding received signal vectors $[R]_{j,l}$ ($j = 1, \ldots, m_b$; $i = 1, \ldots, l$) are
arranged by the decoder into an $m_b$ by $l$ array. We can find that if the $j$-th
received packet is to be decoded, the received digits of all other packets should
also be taken into account, since these contain information about the transmitted
digits of the $j$-th packet (the transmitted digits are related by the
parity-check-constraint condition).
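
The parity-check-constraint and its survival through a linear encoder can be illustrated with a few lines of code. This is a minimal sketch, not the encoder used in the paper: it assumes a rate-1/2 feedforward convolutional encoder with generator taps (1,1,1) and (1,0,1) and zero initial state, chosen only for illustration, and all function names are hypothetical.

```python
from functools import reduce
from operator import xor

def conv_encode(bits, gens=((1, 1, 1), (1, 0, 1))):
    """Rate-1/2 feedforward convolutional encoder over GF(2), zero initial state."""
    state = [0] * (len(gens[0]) - 1)
    encoded = []
    for b in bits:
        window = [b] + state
        # One output bit per generator: inner product of taps and window, mod 2.
        encoded.append(tuple(sum(g & w for g, w in zip(gen, window)) % 2
                             for gen in gens))
        state = window[:-1]
    return encoded

def pcc_packet(packets):
    """Position-by-position modulo 2 sum of the information packets."""
    return [reduce(xor, column) for column in zip(*packets)]

def xor_symbols(encoded_packets):
    """Position-by-position modulo 2 sum of encoded symbols (tuples of bits)."""
    return [tuple(reduce(xor, bits) for bits in zip(*symbols))
            for symbols in zip(*encoded_packets)]

info = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]  # m_b - 1 = 3 packets, l = 4
parity = pcc_packet(info)
```

Encoding `parity` gives the same symbols as summing the encoded information packets, which is the property the decoder exploits when it uses the other packets as side information.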



3 Bootstrap Iterative Viterbi Algorithm for TCM 

In this section, we will modify the BIVA [7] [8] for TCM schemes. In [7] Wei made
a significant simplification on the computation of the parity metric $\Lambda^{(p)}$ for
convolutional codes. The modified BIVA is appropriate to decode the bootstrap
TCM scheme with low complexity. Now let us consider how to construct a
bootstrap TCM scheme and how to derive its parity metric.

In trellis-coded modulation, the key concept is the principle of mapping by set
partitioning [1]. A general TCM system is shown in Fig. 2. In the trellis encoder,
the input information bits are divided into two parts. One part $(U^1, \ldots, U^{\tilde{k}})$
is encoded by a convolutional encoder whose output is used to select one subset
from the whole constellation; the other part $(U^{\tilde{k}+1}, \ldots, U^{k})$ comprises uncoded
bits which are used to determine one signal point within that selected subset.
The TCM system can be combined with the aforementioned bootstrap
structure. Next we describe how to concatenate the bootstrap iterative decoding
algorithm with the VA for such TCM systems. The key idea of bootstrap decoding
is to produce one sequence which follows the PCC condition using $(m_b - 1)$
information sequences. But for TCM systems, the question arises whether to parity
check or protect each bit in one symbol. We know that the bit error probability
(BER) performance of TCM is mainly determined by the minimum squared
Euclidean distance $d^2_{free}$, which is the minimum of the parallel transitions' squared
distance $d^2_{parallel}$ and the coded minimum squared Euclidean distance $d^2_{free,c}$:
$d^2_{free} = \min(d^2_{parallel}, d^2_{free,c})$. If $d^2_{free,c} < d^2_{parallel}$, we can say that the bit
errors caused by parallel path errors can be ignored at a high signal-to-noise ratio
(SNR), and the BER is dominated by $d^2_{free,c}$. Therefore, the performance should
not be dramatically affected if only the coded bits $U^1, \ldots, U^{\tilde{k}}$ are parity checked
or protected and the uncoded bits $U^{\tilde{k}+1}, \ldots, U^{k}$ are left unprotected, instead of
protecting the whole symbol $U^1, \ldots, U^{k}$. The benefit of such an operation is obvious:
the information transmission rate is increased. However, for the TCM systems
in which $d^2_{free,c} > d^2_{parallel}$, if only coded bits are protected, the performance will
be much worse than the one in which the whole symbol is protected.

Fig. 2. Schematic representation of a typical trellis code.

Now let us derive the new metric function which take into account the PCC 
condition for TCM. In this paper, we just focus on protecting the coded bits in 
one symbol. Let [R.]^ / = • ’ ' : denote the received signal vector 

of the packet, and (I/L; V^f) = (V^T, . . . , • • • , 14” ) denote the i*'* 

encoded codeword of the packet, where Vj’^ is the output from the convo- 
lutional encoder and is the uncoded part of the input symbol. If only the 
coded bits are protected, then only 14^- satisfies the PCC condition in the en- 
coded codeword. Suppose that packets ji, j 2 , ■ ■ ■ , jmi ,-2 have been successfully 
decoded using the normal VA and now we are going to decode packet jmt-i- 
The original PCC condition on the symbols in one block {mb packets) is 

= = (3) 

Let Wmt- 2 ,i = receiver replaces the (j = 1, 2, . . . , Wf,- 

2) received packets by the estimated transmitted packets. So we can get the 
estimated PCC condition 

VP™,-2.* = k\^.0---©V4_2..- (4) 

where 14L is the decision of Now the new likehood function for decoding 
the {mb — 1)*^ packet is 

A' = 

^mb — 1,2 



log[P(i?i ,2? ' ' ' ) — l,2j Rrajj, (5) 
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where _ 1 , and ^ are coded part and uncoded part of codeword 

respectively. Assuming • -jUmt-iA and Rmt,i are independent of each other^, 
we then have 



AU-id = -iog[p(i?™,-i..K-i.dcr-i.*)] 

— log[P(i?iy, • • • , Rrai,- 2 ,i, Rmb,i\^mb-lX^ Knf-l,i)]’ 






mfe — l,i 



+ A 



ip) 



( 6 ) 



where i denotes the branch metric value which can be obtained using 

the VA, and i denotes the extrinsic metric value introduced by the PCC 

condition from the other packets. 

If $\hat{W}_{m_b-2,i} = W_{m_b-2,i}$, then we have

$$\hat{W}_{m_b-2,i} \oplus V^{c}_{m_b-1,i} \oplus V^{c}_{m_b,i} = 0.$$  (7)

Therefore, the coded part of the codeword in the $m_b$-th packet, $V^{c}_{m_b,i}$,
can be obtained through the PCC condition and the $(m_b-2)$ decoded received
packets as well. However, there is no PCC relation among the uncoded parts, so the
question is how to determine the uncoded part $V^{u}_{m_b,i}$ of the $i$-th
codeword for the $m_b$-th packet.

We know that the coded part $V^{c}_{m_b,i}$ decides which subset will be selected in
the constellation, and the parallel transition errors in the subset can be ignored
at high SNRs. Therefore, after $V^{c}_{m_b,i}$ is determined, $V^{u}_{m_b,i}$ can
also be decided by selecting the point in this subset which is closest to the
received signal $R_{m_b,i}$. Now we have

$$\lambda^{(p)}_{m_b-1,i} = -\log[P(R_{1,i}, \ldots, R_{m_b-2,i}, R_{m_b,i} \mid V^{c}_{m_b-1,i}, V^{u}_{m_b-1,i})]
 = -\log[P(R_{1,i}, \ldots, R_{m_b-2,i}, R_{m_b,i} \mid V^{c}_{m_b,i} \oplus W_{m_b-2,i}, V^{u}_{m_b-1,i})].$$  (8)

Exact calculation of (8) is computationally very expensive and practically
impossible when $m_b$ is large. Therefore, we approximate (8) as

$$\lambda^{(p)}_{m_b-1,i} \approx -\log[P(R_{m_b,i} \mid V^{c}_{m_b-1,i} \oplus \hat{W}_{m_b-2,i}, V^{u}_{m_b,i})].$$  (9)

It is worth mentioning that $\lambda^{(p)}_{m_b-1,i}$ will not be the correct metric,
and error propagation can result, if $\hat{W}_{m_b-2,i} \ne W_{m_b-2,i}$. The effect
of error propagation can be reduced by scaling down the value of
$\lambda^{(p)}_{m_b-1,i}$ with a scale factor $\alpha$.
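The combination of a branch metric with a scaled-down parity metric can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes a complex AWGN channel with noise variance `noise_var` per dimension, and the function names, the `subset_points` argument and the value chosen for `alpha` are all hypothetical.

```python
def branch_metric(r, point, noise_var):
    """-log P(r | point) for complex AWGN, additive constant dropped."""
    return abs(r - point) ** 2 / (2.0 * noise_var)


def parity_metric(r_parity, subset_points, noise_var):
    """Approximate extrinsic metric in the spirit of (9).

    subset_points: constellation points of the subset selected by the
    predicted coded bits. The closest point stands in for the unknown
    uncoded part (parallel transition errors are ignored).
    """
    return min(branch_metric(r_parity, p, noise_var) for p in subset_points)


def biva_metric(r_cur, point, r_parity, subset_points, noise_var, alpha):
    """Branch metric plus the parity metric scaled down by alpha."""
    return (branch_metric(r_cur, point, noise_var)
            + alpha * parity_metric(r_parity, subset_points, noise_var))
```

A smaller `alpha` gives the parity term less influence, which is the lever the text describes for limiting error propagation.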

¹ Actually, $R_{1,i}, \ldots, R_{m_b,i}$ are weakly dependent because of the PCC
condition among them.

So far, we have obtained the new metric functions for TCM systems combined
with the bootstrap structure. The corresponding bootstrap iterative decoding using
the Viterbi algorithm (BIVA) for TCM can be summarised as follows.

(a) $(m_b - 1) \times l$ information symbols are arranged as a block of $(m_b - 1)$
rows and $l$ columns, where each symbol includes $\tilde{k}$ coded bits and
$(k - \tilde{k})$ uncoded bits. Encode each row of $l$ information symbols into
codewords of length $n$ using a rate-$\tilde{k}/n$ convolutional encoder of memory
length $\nu$. All rows are encoded through the same convolutional encoder.

(b) Generate the parity check row using the PCC condition. If it is only
necessary to protect the coded bits, the coded bits of each symbol in the parity
row will be the parity of the coded bits of the symbols of the previous
$(m_b - 1)$ information packets, and its uncoded bits will continue to be
information bits.

(c) In the receiver, decode the first $(m_b - 2)$ packets based on the VA and update
the PCC condition through the decoded packets.

(d) Decode the next packet based on the BIVA, using the new metric which
takes into account the extrinsic information from other packets.

(e) Update the PCC condition based on the newest decoded values.

(f) Repeat steps (d) and (e) for several iterations until the stop criterion is
satisfied.
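Step (b) and the check used as a stop criterion can be sketched as below. This is a minimal sketch, assuming that the coded bits of each symbol are packed into one integer and that the PCC condition is a bitwise XOR across the rows of a block; the function names are hypothetical.

```python
from functools import reduce
from operator import xor


def parity_row(info_rows):
    """Coded bits of the parity row: column-wise XOR over the information rows."""
    return [reduce(xor, column) for column in zip(*info_rows)]


def pcc_satisfied(all_rows):
    """True if the XOR over all rows of the block is zero in every column."""
    return all(reduce(xor, column) == 0 for column in zip(*all_rows))


# Two information rows of two symbols each (coded bits only).
rows = [[0b101, 0b011],
        [0b110, 0b001]]
block = rows + [parity_row(rows)]
```

With these definitions, flipping any single coded bit of `block` makes `pcc_satisfied` return `False`, which is what lets the decoder detect that another iteration is needed.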

4 Variation of BIVA for Packet Transmission 

In this section, we study BIVA decoding in a packet transmission format.
In the previous sections, several dummy symbols are needed in each packet to make
the decoding of the last information symbols reliable. Such dummy symbols can
be eliminated using tail-biting [9]. In tail-biting, the encoder is first initialised
by inputting the last $\nu$ information bits into the encoder and ignoring the output.
Therefore, the start and end encoder states are constrained to be identical; that
is, a trellis codeword starts from the state at which it will eventually end.
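The initialisation described above can be illustrated with a toy encoder. The sketch assumes a rate-1/2 feedforward convolutional encoder with generators (7, 5) in octal and memory 2, chosen only for brevity; it is not one of the codes used in the paper, and encoders with feedback need a different initialisation.

```python
MEMORY = 2  # encoder memory (nu) of the toy code below


def conv_encode(bits, state):
    """Rate-1/2 feedforward encoder, generators (7, 5) in octal.

    state = (most recent input bit, the bit before it).
    Returns the list of output pairs and the final state.
    """
    s1, s2 = state
    out = []
    for b in bits:
        out.append((b ^ s1 ^ s2, b ^ s2))
        s1, s2 = b, s1
    return out, (s1, s2)


def tail_biting_encode(bits):
    # Initialise the encoder with the last MEMORY bits and ignore the output.
    _, start = conv_encode(bits[-MEMORY:], (0, 0))
    codeword, end = conv_encode(bits, start)
    return codeword, start, end
```

Because the state of a feedforward encoder depends only on the last `MEMORY` input bits, `start` and `end` coincide without any dummy symbols.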

4.1 One Dimensional (1-D) Bootstrap Structure 

We can now use the tail-biting technique in each packet to encode all of the
packet's information symbols. However, considerable computation is required
if every packet applies tail-biting. In this subsection, we present a
modification of the BIVA that cuts down this computation without affecting its
error performance.

The key concept of the 1-D bootstrap structure is to connect all the row packets
into one big packet. In other words, instead of terminating each row packet, the
information symbols of all row packets continue to feed into the encoder
until the last symbol.

Next we consider how to update the PCC condition $W$ in the 1-D BIVA. The
Viterbi decoder progresses over the trellis for a certain depth (i.e., the
truncation length) and then produces its decision result. Ideally, updating for
one packet should be based on the decoding history of the other packets, namely,
$W$ should be updated based on the newest decoded values of the other packets. In
the 1-D BIVA, however, the decision results of any packet cannot be determined
until the whole super packet is finished. Thus, for all packets, $W$ is updated only
once in every iteration. Simulation results show that the way $W$ is updated has
little effect on the error performance and iteration number.

4.2 Two Dimensional (2-D) Bootstrap Structure

We can re-apply the bootstrap structure on the structure in the above subsection 
4.1 to build up a two dimensional bootstrap structure. The main motivation is 
to build up a block code with a large minimum free distance, but decodable by 
an iterative BIVA with a very low complexity. The 2-D bootstrap structure is 
illustrated in Fig. 3. In the 2-D structure, there are two types of parity-checks
which can produce two sorts of parity metrics². The type (a) constraint provides the
parity metric given in (9) for each super row packet. The type (b) constraint
provides the parity metric for each column in the structure given in Fig. 3. Again,
both parity metrics are used after the first iteration. Updating the parity metrics
for the type (a) and (b) constraints is exactly the same as the procedure given in
the above subsection.

[Figure 3: super packets stacked as rows. Super packet 1 contains packets
$1_1, 2_1, \ldots, m_{b,1}$; super packet 2 contains packets $1_2, 2_2, \ldots, m_{b,2}$;
the super parity packet $B$ contains packets $1_B, 2_B, \ldots, m_{b,B}$.]

Fig. 3. 2-D bootstrap structure.

² A third type of parity check constraint can be contemplated for the 2-D structure,
which is a combination of row and column check sums. However, it was found that this
posed computational and technical difficulties and did not significantly contribute
to the performance.

5 Combining Shaping Techniques with Bootstrap TCM System

It has come to be recognised that shaping and coding are two separable and
complementary components of TCM systems. In [10], it was shown
that shaping gain can be achieved by using nonuniform, Gaussian-like signalling.
One of the approaches, called trellis shaping, was proposed by Forney [10]. It was
shown that a simple 4-state shaping code can achieve about 1.0 dB shaping gain,
which is about 2/3 of the full 1.53 dB ultimate shaping gain. In this work, we
concentrate on trellis shaping with TCM systems using the bootstrap iterative
decoding algorithm. A TCM system using BIVA with shaping techniques is
shown in Fig. 4. A $2^n$-point 2-D constellation is used to transmit
$k = k_c + n_u + k_s$ information bits/T, where $k_c$, $n_u$ and $k_s$ are the numbers
of channel coded bits $x_l$, uncoded bits $w_l$ and shaping coded bits $s_l$,
respectively. The 2-D constellation is first partitioned into $2^{n_c}$ subsets using
mapping by set partitioning [1]. The $n_c$ coded bits $y_l$ produced by the channel
encoder $G_c$ at time $l$ specify one of the $2^{n_c}$ subsets. The 2-D constellation
is also divided into $2^{n_s}$ subregions. The $n_s$ shaping bits $z_l$ produced by the
shaping encoder at time $l$ specify one of the $2^{n_s}$ subregions. The uncoded bits
$w_l$ at time $l$ specify a point $a_l = M(y_l, w_l, z_l)$ in the constellation for a
given subregion and a given subset. At the receiver, a Viterbi decoder with BIVA
can be used to obtain the estimate of the transmitted information bits. To update
the PCC condition $W_{j,i}$, one channel encoder $G_c$ is needed to re-encode the
decoded bits $x_l$ at the receiver.

Fig. 4. Coded modulation system with trellis shaping.
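The shaping-gain figures quoted in this section can be checked numerically. The sketch below uses the standard expression $\pi e/6$ for the ultimate shaping gain, which is not written out in the text; the variable names are illustrative.

```python
import math

# Ultimate shaping gain of an N-sphere over an N-cube as N grows: pi * e / 6.
ultimate_gain_db = 10 * math.log10(math.pi * math.e / 6)

# Fraction of the ultimate gain reached by a shaping code with about 1.0 dB gain.
fraction = 1.0 / ultimate_gain_db

print(round(ultimate_gain_db, 2), round(fraction, 2))
```

The first value rounds to 1.53 dB and the second is close to 2/3, in line with the numbers quoted for the 4-state shaping code.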

It was shown in [7] that the free distance of the block code is doubled by the 1-D
bootstrap structure. We note that the free distance is determined only by the coded
bits $x_l$ for TCM schemes in which $d^2_{\mathrm{parallel}}$ is larger than
$d^2_{\mathrm{free},c}$. Therefore, the shaping code bits $s_l$ should not affect the
distance property, and the full shaping gain will be achieved when shaping
techniques are combined with TCM systems using BIVA.



6 Numerical Results 

In the previous sections, the BIVA for bootstrap TCM systems has been pro- 
posed and shaping technique integrated with such TCM was also discussed. In 
this section, we report some simulation results to compare the performance with 
the Viterbi algorithm. 

In our simulation, each superblock has 20 x 20 packets in 2-D bootstrap struc- 
ture, i.e., mb = 20 and B = 20, and each packet includes Tiy symbols. The largest 
iteration number is set as 100. If the stop criterion is met within 100 iterations,
the decoding process is stopped. The stop criterion in bootstrap decoding is
straightforward. If the coded parts of the decoded symbols (part protection) in all
$m_b$ packets (1-D bootstrap structure) or $(m_b \times B)$ packets (2-D bootstrap
structure) satisfy the original PCC condition, i.e.,
$W_{m_b,i} = V^{c}_{1,i} \oplus \cdots \oplus V^{c}_{m_b,i} = 0$, then the iterative
decoding process is stopped. Figure 5 shows the performance of 2-D BIVA
with part protection for a 2-D 256-state 6 bits/T TCM scheme, combined with
the trellis shaping technique. A 16-state shaping code is applied in our simulation.
The 16-state and 256-state TCM codes are Ungerboeck's codes [2]. For comparison, the
performance of this TCM scheme using the VA is also reported. The results show
that about 2.0 dB gross gain can be achieved beyond the VA without shaping
and about 2.7 dB gross gain with shaping. However, note that the shaping gain
is smaller at low SNRs, and only about 0.7 dB shaping gain was obtained in
this TCM scheme with the 2-D bootstrap structure (a similar case can be found in
[11]).

Fig. 5. Performance of $\nu = 8$ trellis codes at a spectral efficiency of 6 bits/T
with part protection using 2-D BIVA.

We can also see the error floor in Fig. 5. This error floor is mainly dominated
by the parallel path errors at relatively low SNRs. Such errors can be reduced
through multilevel error protection techniques.

It is noteworthy that each block includes some non-information bits, which
cause the real transmission rate to be lower than the nominal rate; namely, the
parity bits introduced by the bootstrap structure offset a part of the achieved
gains and result in a practical rate loss. In our case, the real rate is now 5.805
bits/T, and the Shannon limit for this rate is 9.76 dB. Therefore the performance
of 1.25 dB away
from Shannon limit at BER = 3 x 10“® is achieved by such a TCM scheme. 
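The two figures quoted above can be reproduced with a short calculation. This is a sketch under assumptions that the text does not state explicitly: the Shannon limit is taken as the minimum $E_b/N_0$ at spectral efficiency $\eta$ bits per 2-D symbol, from $\eta = \log_2(1 + \eta E_b/N_0)$, and the rate accounting (2 coded information bits given up at each parity position of the 20 x 20 structure) is a hypothetical reading that happens to match the quoted rate.

```python
import math


def shannon_limit_db(eta):
    """Minimum Eb/N0 in dB at spectral efficiency eta (bits per 2-D symbol)."""
    return 10 * math.log10((2 ** eta - 1) / eta)


# Hypothetical rate accounting: only the coded bits carry parity.
nominal, coded_bits, m_b, B = 6.0, 2, 20, 20
real_rate = nominal - coded_bits * (1 - (m_b - 1) * (B - 1) / (m_b * B))

print(round(real_rate, 3), round(shannon_limit_db(real_rate), 2))
```

Under these assumptions the rate comes out at 5.805 bits/T and the limit at about 9.76 dB.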

Finally, we have to point out that the scale factor $\alpha$ mentioned in Section
3 is critical to the performance and convergence speed. The average iteration
number varies with the SNR and with the code. Generally, fewer iterations are
required at higher SNRs or with codes of larger memory length.
In our simulation, for the $\nu = 8$ code, the average numbers of iterations are 45
and 13 for SNR = 11.01 dB and 11.1 dB, respectively. This shows that the peak and
average complexity can be significantly cut down if we increase the SNR by 0.1 dB.

7 Conclusions

A bootstrap iterative Viterbi algorithm has been proposed, and new metric functions
which take into account the extrinsic information based on the bootstrap
TCM systems have also been derived. Additional gains for trellis codes using the
BIVA instead of the VA can be obtained with low complexity and reasonable
computation. For TCM schemes in which $d^2_{\mathrm{free},c} < d^2_{\mathrm{parallel}}$,
protecting the coded bits instead of the whole symbol can achieve most of the gain
with less redundancy. 1-D and 2-D bootstrap structures are also designed to be
suitable for packet transmission. Such bootstrap decoding can be applied with
existing systems such as ADSL. Furthermore, the trellis shaping technique is employed
on bootstrap TCM systems and full shaping gain can be achieved. In the simulation,
a performance of 1.25 dB away from the Shannon limit is achieved by 2-D
BIVA for a 2-D 256-state 6 bits/T TCM scheme combined with trellis shaping.

Abstract. Based on $h$ reference words, Kasami et al. proposed an integer
programming problem (IPP) whose optimal value of the objective
function gives a sufficient condition on the optimality of a decoded codeword.
The IPP has been solved for $1 \le h \le 3$. In this paper, an algorithm
for solving the IPP for $h = 4$ is presented. The computational complexity
of this algorithm is investigated.



1 Introduction 

Let IDA denote a soft-decision iterative decoding algorithm for a binary block
code. For most IDAs, in each successful decoding step a candidate codeword
is generated by a simple decoder and an optimality testing condition is tested
on the candidate codeword. When the testing condition is satisfied, the decoding
iteration process is terminated and the optimal (or most likely) codeword
is obtained. A number of testing conditions have been derived, such as those
proposed in [1] and [2]. Recently, based on $h$ reference words, Kasami et al. in
[3,4] proposed an integer programming problem (IPP), whose optimal value of
the objective function gives an optimality testing condition that can be
incorporated in any IDA which is based on the generation of a sequence of candidate
codewords. This testing condition with $h = 3$ was used effectively in the iterative
decoding algorithm presented in [5]. It was shown that this testing condition can
provide fast termination of the decoding iteration without degrading the error
performance. It is pointed out in [6] that the approach used in the derivation of
this testing condition can also be used to derive two other important conditions
for IDAs. One is the ruling-out condition, which can be used to skip useless
decoding steps: if the ruling-out condition holds in some decoding step, then the
output of the next decoding step cannot be better (or more likely) than the
best candidate codeword obtained so far. The other is a stronger version of the
ruling-out condition, called the early termination condition: if it holds in some
decoding step, then none of the successive decoding steps can generate a candidate
codeword which is better than the best candidate codeword obtained so far, that is,
no further iteration improves the error performance. In [7], the approach is also
used to find a good sequence of search centers around which algebraic decodings are
iterated.

In [3,4], the computational complexity for solving the IPP was investigated
for $1 \le h \le 3$; the number of additions and comparisons of real numbers, which
accounts for the majority of the computational complexity, was shown
to be of order $N$, where $N$ is the code length. For larger $h$, this testing
condition is stronger while the computational complexity for solving the IPP grows.
It is desirable to give effective methods to solve the IPP for relatively large $h$.

In this paper, we consider solving the IPP for $h = 4$. In Section 2, we
briefly review the IPP proposed by Kasami et al. in [3,4] and [6]. In Section 3,
an algorithm for solving the IPP for $h = 4$ is presented. The IPP is split into at
most 9 sub-IPPs, and the number of variables of each sub-IPP is half of that of
the original IPP. The number of additions and comparisons of real numbers for
this algorithm is shown to be of order $N^2$. In Section 4, each sub-IPP is split
further into a few simpler sub-sub-IPPs that are solved by simple iterations. The
proofs of the theorems and lemmas appearing in Sections 3 and 4 can be found
in [8].

2 The Testing Condition of Optimality 

For a positive integer $N$, let $V^N$ denote the set of all binary $N$-tuples over
$V = GF(2)$. Suppose a binary block code $C \subseteq V^N$ is used for error control
over the AWGN channel with BPSK signaling, and $r = (r_1, r_2, \ldots, r_N)$ is a
received $N$-tuple at the output of a matched filter in the receiver. Let
$z = (z_1, z_2, \ldots, z_N)$ be the binary hard-decision $N$-tuple obtained from $r$
using the hard-decision function: $z_i = 1$ for $r_i > 0$ and $z_i = 0$ for
$r_i \le 0$. $|r_i|$ indicates the reliability of $z_i$ for each $i$.

For $u = (u_1, u_2, \ldots, u_N) \in V^N$, define
$\mathcal{D}_1(u) = \{i : u_i \ne z_i, 1 \le i \le N\}$ and
$\mathcal{D}_0(u) = \{1, 2, \ldots, N\} \setminus \mathcal{D}_1(u)$.
$L(u) = \sum_{i \in \mathcal{D}_1(u)} |r_i|$ is called the correlation
discrepancy of $u$ with respect to the hard-decision tuple $z$ [3,4]. For any subset
$T$ of $V^N$, let $L[T] = \min_{u \in T} L(u)$, and write $L[\emptyset] = +\infty$.
The maximum likelihood decoding (MLD) can be stated in terms of the correlation
discrepancy as follows: the decoder decodes the received tuple $r$ into the optimal
codeword $c_{opt} \in C$ with $L(c_{opt}) = L[C]$.
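The definitions above translate directly into a few lines of code. The sketch below is illustrative only: the toy repetition code, the received values, and the assumed BPSK mapping (bit 0 to -1, bit 1 to +1) are not taken from the paper. Under that mapping the correlation of a codeword with $r$ equals $\sum_i |r_i| - 2L(c)$, so minimising the discrepancy and maximising the correlation select the same codeword.

```python
def hard_decision(r):
    """z_i = 1 for r_i > 0 and z_i = 0 otherwise."""
    return [1 if ri > 0 else 0 for ri in r]


def discrepancy(u, r):
    """Correlation discrepancy L(u): sum of |r_i| where u differs from z."""
    z = hard_decision(r)
    return sum(abs(ri) for ui, zi, ri in zip(u, z, r) if ui != zi)


def mld(code, r):
    """Maximum likelihood decoding as minimum correlation discrepancy."""
    return min(code, key=lambda c: discrepancy(c, r))


def max_correlation(code, r):
    """Reference decoder: maximise sum r_i * (2 c_i - 1)."""
    return max(code, key=lambda c: sum(ri * (2 * ci - 1) for ci, ri in zip(c, r)))


code = [(0, 0, 0), (1, 1, 1)]   # toy repetition code
r = [0.9, -0.2, 0.4]            # toy received tuple
```

Here only the second position of the all-one codeword disagrees with the hard decision, so its discrepancy is just the reliability of that position.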

For $u \in V^N$ and a positive integer $d$, let
$O_d(u) \triangleq \{v \in V^N : d_H(u, v) < d\}$, where $d_H(u, v)$ is the Hamming
distance between $u$ and $v$. Let $h$ be a positive integer and
$u_1, u_2, \ldots, u_h$ be $h$ reference words in $V^N$. Let
$R = \bigcup_{j=1}^{h} O_{d_j}(u_j)$, where $d_j$ is called the preassigned radius
for the reference word $u_j$. Write
$V^N_{d_1,d_2,\ldots,d_h}(u_1, u_2, \ldots, u_h) = V^N \setminus R$, that is

$$V^N_{d_1,d_2,\ldots,d_h}(u_1, u_2, \ldots, u_h) = \{v \in V^N : d_H(u_j, v) \ge d_j \text{ for } 1 \le j \le h\}.$$  (1)

Lemma 1. If a codeword $c_{best}$ in $R$ satisfies

$$L(c_{best}) \le L[V^N_{d_1,d_2,\ldots,d_h}(u_1, u_2, \ldots, u_h)],$$  (2)

then the optimal codeword $c_{opt}$ must belong to the set $R$. Furthermore, we have
$c_{best} = c_{opt}$ if $L(c_{best}) = L[C \cap R]$.

Lemma 1 not only defines a region which contains the optimal codeword
$c_{opt}$ but also provides a sufficient testing condition for the optimality of
candidates which have been generated. The selection of the reference words and the
preassigned radii considerably affects the effectiveness of Lemma 1. If we
incorporate it in a soft-decision iterative decoding algorithm, we can select the
reference words from the previously generated candidate codewords, which are
trusted to possess low correlation discrepancies, and take the covering distance
around $u_j$ assured by the decoding algorithm as the preassigned radius for the
reference word $u_j$. Here we do not consider the problem of selecting the
reference words and the preassigned radii further. The main problem we are concerned
with in this paper is to evaluate
$L[V^N_{d_1,d_2,\ldots,d_h}(u_1, u_2, \ldots, u_h)]$. Below we introduce
an IPP proposed by Kasami et al. in [3,4]; the optimal value of this IPP
was shown to be equal to $L[V^N_{d_1,d_2,\ldots,d_h}(u_1, u_2, \ldots, u_h)]$.

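Let $B^h$ denote the set of all $h$-tuples over $B = \{0, 1\}$. For simple notation,
the $h$-tuple $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_h)$ may be represented as
$\alpha_1\alpha_2 \cdots \alpha_h$. Define

$$\mathcal{D}_\alpha \triangleq \bigcap_{j=1}^{h} \mathcal{D}_{\alpha_j}(u_j), \quad n_\alpha \triangleq |\mathcal{D}_\alpha|.$$  (3)

Let $Q^h$ denote the set of all the $2^h$-tuples over nonnegative integers. For
$q \in Q^h$, each of the $2^h$ components of $q$ is referred to as $q_\alpha$ with
$\alpha \in B^h$. Let $Q_h$ denote the set of all the $2^h$-tuples $q \in Q^h$ which
satisfy

$$0 \le q_\alpha \le n_\alpha, \text{ for all } \alpha \in B^h,$$  (4)

$$\sum_{\alpha \in B^h} q_\alpha (-1)^{\alpha_i} \ge \delta_i \triangleq d_i - |\mathcal{D}_1(u_i)|, \text{ for } i = 1, 2, \ldots, h.$$  (5)

Without loss of generality, we assume further that the components of the
received tuple $r$ are ordered in the increasing order of their absolute values

$$|r_1| \le |r_2| \le \cdots \le |r_N|.$$  (6)

For convenience, we define $r_* = +\infty$. For any subset
$X \subseteq \{1, 2, \ldots, N, *\}$ and integer $j$, let $X^{(j)}$ denote the set of
the $j$ smallest integers in $X \setminus \{*\}$ if $1 \le j \le |X \setminus \{*\}|$,
the set $X \cup \{*\}$ if $j > |X \setminus \{*\}|$, and the empty set $\emptyset$
otherwise. For $q \in Q^h$, let
$\mathcal{D}(q) = \bigcup_{\alpha \in B^h} \mathcal{D}_\alpha^{(q_\alpha)}$ and
$L'(q) = \sum_{i \in \mathcal{D}(q)} |r_i|$. For a nonempty subset $Q' \subseteq Q^h$,
let $L'[Q']$ denote the optimal value of the following IPP:

$\mathcal{P}(Q')$: Minimize $\{L'(q) \mid q \in Q'\}$,

i.e. $L'[Q'] = \min_{q \in Q'} L'(q)$. For convenience, we write
$L'[\emptyset] = +\infty$. A $2^h$-tuple $q \in Q'$ is called a $Q'$-optimum if
$L'(q) = L'[Q']$.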

204 Yuansheng Tang, Tadao Kasami, and Toru Fujiwara

It was shown in [3,4] that $L[V^N_{d_1,d_2,\ldots,d_h}(u_1, u_2, \ldots, u_h)]$ is
equal to the optimal value $L'[Q_h]$ of the IPP $\mathcal{P}(Q_h)$ if the received
tuple $r$ satisfies (6), i.e.

Theorem 1. If the received tuple $r$ satisfies (6), we have the following formula:

$$L[V^N_{d_1,d_2,\ldots,d_h}(u_1, u_2, \ldots, u_h)] = L'[Q_h].$$  (7)

If $\delta_i \le 0$ for $i = 1, 2, \ldots, h$, then the all-zero $2^h$-tuple $q = 0$
belongs to $Q_h$ and $L'[Q_h] = 0$; this implies that the hard-decision tuple $z$
belongs to the set $V^N_{d_1,d_2,\ldots,d_h}(u_1, u_2, \ldots, u_h)$. Without loss of
generality, hereafter we assume that the components of
$\delta = (\delta_1, \delta_2, \ldots, \delta_h)$ satisfy

$$\delta_1 \ge \delta_2 \ge \cdots \ge \delta_h, \quad \delta_1 > 0.$$  (8)

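For each $h$ with $1 \le h \le 3$, the IPP $\mathcal{P}(Q_h)$ was solved in [3,4].
The main results on the IPP $\mathcal{P}(Q_h)$ obtained in [3,4] can be summarized as
the following three results:

Result 1. If $h = 1$ and (6) hold, then

$$L'[Q_1] = \sum_{i \in \mathcal{D}_0^{(\delta_1)}} |r_i|.$$  (9)

Result 2. If $h = 2$ and (6) and (8) hold, then

$$L'[Q_2] = \sum_{i \in (\mathcal{D}_{00} \cup \mathcal{D}_{01}^{(\lfloor(\delta_1-\delta_2)/2\rfloor)})^{(\delta_1)}} |r_i|.$$  (10)

Result 3. If $h = 3$ and (6) and (8) hold, then

$$L'[Q_3] = \min_{1 \le j \le 2} \min_{0 \le k \le k_j} \sum_{i \in Y_j(k)} |r_i|,$$  (11)

where $k_1 = \min\{\delta_1, \lfloor(\delta_1-\delta_2)/2\rfloor, \lfloor(\delta_1-\delta_3)/2\rfloor\}$,
$k_2 = \min\{\lceil(\delta_2+\delta_3)/2\rceil, \lceil(\delta_1+\delta_3)/2\rceil, \lceil(\delta_1+\delta_2)/2\rceil\}$ and

$$Y_1(k) \triangleq \mathcal{D}_{011}^{(k)} \cup (\mathcal{D}_{000} \cup \mathcal{D}_{010}^{(\lfloor(\delta_1-\delta_2)/2\rfloor-k)} \cup \mathcal{D}_{001}^{(\lfloor(\delta_1-\delta_3)/2\rfloor-k)})^{(\delta_1-k)},$$  (12)

$$Y_2(k) \triangleq \mathcal{D}_{000}^{(k)} \cup \mathcal{D}_{100}^{(\lceil(\delta_2+\delta_3)/2\rceil-k)} \cup \mathcal{D}_{010}^{(\lceil(\delta_1+\delta_3)/2\rceil-k)} \cup \mathcal{D}_{001}^{(\lceil(\delta_1+\delta_2)/2\rceil-k)}.$$  (13)

Furthermore, for each $j$, there is a $k'$ with $0 \le k' \le k_j$ such that
$\sum_{i \in Y_j(k)} |r_i|$ is non-increasing in $[0, k']$ and non-decreasing in
$[k', k_j]$. Since with at most 4 operations of additions and comparisons of real
numbers we can determine whether
$\sum_{i \in Y_j(k+1)} |r_i| - \sum_{i \in Y_j(k)} |r_i| > 0$ holds or not, the number
of operations of additions and comparisons of real numbers for solving the IPP
$\mathcal{P}(Q_3)$ is of order $N$. □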
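The closed forms for $h = 1$ and $h = 2$ can be checked against exhaustive search on a small example. The sketch below is not from the paper: it assumes the readings $L'[Q_1] = \sum |r_i|$ over the $\delta_1$ least reliable positions of $\mathcal{D}_0$ and $L'[Q_2] = \sum |r_i|$ over the $\delta_1$ least reliable positions of $\mathcal{D}_{00} \cup \mathcal{D}_{01}^{(\lfloor(\delta_1-\delta_2)/2\rfloor)}$, uses 0-based indices, and requires `r` to be ordered as in (6). All function names are hypothetical.

```python
import math
from itertools import product


def delta(u, d, z):
    """delta = d - |D_1(u)|, with D_1(u) the positions where u differs from z."""
    return d - sum(ui != zi for ui, zi in zip(u, z))


def brute_force(r, z, refs, radii):
    """Minimum discrepancy over all v with d_H(u_j, v) >= d_j for every j."""
    best = math.inf
    for v in product((0, 1), repeat=len(r)):
        if all(sum(a != b for a, b in zip(u, v)) >= d
               for u, d in zip(refs, radii)):
            cost = sum(abs(ri) for vi, zi, ri in zip(v, z, r) if vi != zi)
            best = min(best, cost)
    return best


def take(indices, j, r):
    """Sum of |r_i| over the j smallest indices, +inf if there are too few."""
    if j > len(indices):
        return math.inf
    return sum(abs(r[i]) for i in sorted(indices)[:max(j, 0)])


def result1(r, z, u1, d1):
    d0 = [i for i in range(len(r)) if u1[i] == z[i]]
    return take(d0, delta(u1, d1, z), r)


def result2(r, z, u1, d1, u2, d2):
    if delta(u1, d1, z) < delta(u2, d2, z):   # enforce delta_1 >= delta_2
        u1, d1, u2, d2 = u2, d2, u1, d1
    de1, de2 = delta(u1, d1, z), delta(u2, d2, z)
    idx = range(len(r))
    d00 = [i for i in idx if u1[i] == z[i] and u2[i] == z[i]]
    d01 = [i for i in idx if u1[i] == z[i] and u2[i] != z[i]]
    return take(d00 + d01[:(de1 - de2) // 2], de1, r)
```

The exhaustive search costs $2^N$ evaluations, while the closed forms only touch the few least reliable positions, which is the point of the order-$N$ statement above.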

For greater $h$, the testing condition is stronger while the computational
complexity for solving the IPP $\mathcal{P}(Q_h)$ grows. It is pointed out in [9]
that, under the assumption of correlation uniqueness, i.e. $L(u) \ne L(v)$ for any
different tuples $u$ and $v$ in $V^N$, the inequality $L'[Q_{h+1}] > L'[Q_h]$ holds if
and only if the $Q_h$-optimum is not in $Q_{h+1}$. We will solve the IPP for $h = 4$
in this paper.

3 Algorithm for Solving the IPP

In this section, we present an algorithm to solve the IPP $\mathcal{P}(Q_4)$ by
splitting it into at most 9 sub-IPPs; the number of variables of each sub-IPP
is only half of that of the IPP $\mathcal{P}(Q_4)$.

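For any $q \in Q^4$, write $S(q) = \{\alpha \in B^4 : q_\alpha \ge 1\}$. For
$\alpha \in B^4$, let $w_H(\alpha)$ denote the Hamming weight of $\alpha$. Let
$Y = \{\alpha \in B^4 : w_H(\alpha) \le 1\}$. For any $\ell$ with $1 \le \ell \le 4$,
we call $C_\ell = \{\alpha \in B^4 : \alpha_\ell = 0\}$,
$D_\ell = Y \cup \{\alpha \in B^4 : w_H(\alpha) = 2, \alpha_\ell = 1\}$ and
$E_\ell = Y \cup \{\alpha \in B^4 : w_H(\alpha) = 2, \alpha_\ell = 0\}$ $A_4$-sets.
For each $A_4$-set $\Sigma$, there is one and only one sequence in $\Sigma$, denoted
$\alpha(\Sigma)$, such that
$\rho(\Sigma) = (\rho(\Sigma)_1, \rho(\Sigma)_2, \rho(\Sigma)_3, \rho(\Sigma)_4) = \delta + \alpha(\Sigma)$
is a 4-tuple over even integers or a 4-tuple over odd integers. For $q \in Q^4$, let
$\Delta(q)_i$ denote the left-hand side of (5). Let $Q(C_\ell)$ denote the set of
those $2^4$-tuples $q \in Q_4$ which satisfy $S(q) \subseteq C_\ell$ and
$\Delta(q)_\ell = \rho(C_\ell)_\ell = \delta_\ell$. Let $Q(D_\ell)$ denote the set of
those $2^4$-tuples $q \in Q_4$ which satisfy $S(q) \subseteq D_\ell$ and
$\Delta(q)_j = \rho(D_\ell)_j$ for all $j$ with $j \ne \ell$. Let $Q(E_\ell)$ denote
the set of those $2^4$-tuples $q \in Q_4$ which satisfy $S(q) \subseteq E_\ell$ and
$\Delta(q)_j = \rho(E_\ell)_j$ for $j = 1, 2, 3, 4$.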
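The sets introduced above are small enough to enumerate. The sketch below assumes the following reading of the definitions: $Y$ is the set of 4-tuples of Hamming weight at most 1, $C_\ell$ fixes position $\ell$ to 0, and $D_\ell$ and $E_\ell$ extend $Y$ by the weight-2 tuples with position $\ell$ equal to 1 and 0, respectively. It finds the unique sequence for which $\delta$ plus that sequence is all even or all odd. The function names and the example value of `delta` are hypothetical.

```python
from itertools import product

B4 = list(product((0, 1), repeat=4))
Y = [a for a in B4 if sum(a) <= 1]


def C(l):
    return [a for a in B4 if a[l - 1] == 0]


def D(l):
    return Y + [a for a in B4 if sum(a) == 2 and a[l - 1] == 1]


def E(l):
    return Y + [a for a in B4 if sum(a) == 2 and a[l - 1] == 0]


def alpha_rho(sigma, delta):
    """Return (alpha(Sigma), rho(Sigma)) for a set Sigma and a 4-tuple delta."""
    candidates = [a for a in sigma
                  if len({(d + x) % 2 for d, x in zip(delta, a)}) == 1]
    if len(candidates) != 1:
        raise ValueError("expected exactly one sequence with uniform parity")
    a = candidates[0]
    return a, tuple(d + x for d, x in zip(delta, a))


delta = (3, 1, 0, -2)
print(alpha_rho(C(1), delta), alpha_rho(D(1), delta), alpha_rho(E(1), delta))
```

Each of the nine sets has 8 elements, half of the 16 components of a $2^4$-tuple, which matches the statement that every sub-IPP has half as many variables as the original IPP.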

Theorem 2. If the received $N$-tuple $r$ satisfies (6), we have

$$L'[Q_4] = L'[Q_{\min}],$$  (14)

where $Q_{\min} = \bigcup_{\ell=1}^{4} (Q(C_\ell) \cup Q(D_\ell) \cup Q(E_\ell))$.

In general, for some $A_4$-sets $\Sigma$ the sets $Q(\Sigma)$ are empty or covered by
the others and thus should be excluded from further consideration.

Theorem 3. If the 4-tuple $\delta$ satisfies (8), then we have

$$Q_{\min} = \bigcup_{\Sigma \in \{C_1\} \cup H^*} Q(\Sigma),$$  (15)

where $H^*$ is a set of $A_4$-sets defined as follows:

(i). $\emptyset$ if $\delta_2 + \delta_3 + 1 < 0$ and $\sum_{j=1}^{4} \rho(E_1)_j < 0$;

(ii). $\{E_1\}$ if $\delta_2 + \delta_3 + 1 < 0$ and $\sum_{j=1}^{4} \rho(E_1)_j \ge 0$;

(iii). $\{D_4\}$ if $\delta_2 + \delta_3 + 1 \ge 0$ and $\delta_1 + \delta_4 + 1 < 0$;

(iv). $\{E_1, D_4\}$ if $\delta_2 + \delta_3 + 1 \ge 0$ and $\delta_1 + \delta_4 + 1 \ge 0$ and $\delta_2 + \delta_4 + 1 < 0$;

(v). $\{E_1, E_2, D_3, D_4\}$ if $\delta_2 + \delta_4 + 1 \ge 0$ and $\delta_3 + \delta_4 + 1 < 0$;

(vi). $\{E_1, E_2, E_3, E_4, D_2, D_3, D_4\}$ if $\delta_3 + \delta_4 + 1 \ge 0$ and $\rho(D_1)_1 \ge \sum_{j=2}^{4} \rho(D_1)_j$;

(vii). $\{E_1, E_2, E_3, E_4, D_1, D_2, D_3, D_4\}$ if $\rho(D_1)_1 < \sum_{j=2}^{4} \rho(D_1)_j$.

If both (6) and (8) are valid, from Theorems 2 and 3 we see easily that to
solve the IPP $\mathcal{P}(Q_4)$ it is only necessary to solve the IPP
$\mathcal{P}(Q(\Sigma))$ for the $A_4$-sets $\Sigma$ in $\{C_1\} \cup H^*$. For each
$A_4$-set $\Sigma \in \{C_1, D_1, D_2, D_3, D_4, E_1, E_2, E_3, E_4\}$, we
give a Sub-algorithm-$Q(\Sigma)$ to solve the IPP $\mathcal{P}(Q(\Sigma))$ in the
next section. Using these sub-algorithms we can solve the IPP $\mathcal{P}(Q_4)$ by
the following algorithm.

Algorithm for Solving the IPP $\mathcal{P}(Q_4)$

Input. The $N$-tuple $r$ with (6). The 4-tuple
$\delta = (\delta_1, \delta_2, \delta_3, \delta_4)$ defined by (5) and
satisfying (8). $\mathcal{D}_\alpha$ and $n_\alpha$ defined by (3) for all
$\alpha \in B^4$.

Output. $L'[Q_4] = L[V^N_{d_1,d_2,d_3,d_4}(u_1, u_2, u_3, u_4)]$.

Step 1. By the definitions, determine the 4-tuples $\rho(D_1)$ and $\rho(E_1)$. Then
generate the set $H^*$ by Theorem 3, and solve the IPP $\mathcal{P}(Q(\Sigma))$ for
each $\Sigma \in \{C_1\} \cup H^*$ by the Sub-algorithm-$Q(\Sigma)$.

Step 2. Output $\min_{\Sigma \in \{C_1\} \cup H^*} L'[Q(\Sigma)]$ and END. □

Since for all of the sub-algorithms the numbers of additions and comparisons
of real numbers are of order $\delta_1^2$, we see easily that the number of additions
and comparisons of real numbers of the above algorithm is of order $\delta_1^2$, and
thus is of order $N^2$.

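4 Sub-algorithms for Solving the IPPs $\mathcal{P}(Q(\Sigma))$

4.1 Sub-algorithm for the Evaluation of $L'[Q(C_1)]$

Let $\delta^{c}_i = \max\{0, \lfloor(\delta_1 + \delta_i + 1)/2\rfloor\}$, $i = 2, 3, 4$.
For integers $d$, $k$ with $0 \le d \le \delta_1$, $0 \le k \le \delta^{c}_4$, let
$R^{d,k}$ be the set of the $2^4$-tuples $q$ in $Q^4$ which satisfy (4) and

$$q_{0000} + q_{0001} = d, \quad \sum_{\alpha \in C'_1} q_\alpha = \delta_1 - d,$$
$$q_{0010} + q_{0011} \ge \delta^{c}_2 - d, \quad q_{0100} + q_{0101} \ge \delta^{c}_3 - d,$$  (16)
$$q_{0000} + q_{0010} + q_{0100} + q_{0110} \ge k,$$
$$q_\alpha = 0, \text{ for } \alpha \notin C_1,$$

where $C'_1 = C_1 \setminus \{0000, 0001\}$. Clearly,
$R^{d,\delta^{c}_4} \subseteq \cdots \subseteq R^{d,1} \subseteq R^{d,0} \subseteq Q^4$. Let

$$d^{l} \triangleq \max\{0, \ \delta_1 - \sum_{\alpha \in C'_1} n_\alpha, \ \delta^{c}_2 - n_{0011} - n_{0010}, \ \delta^{c}_3 - n_{0101} - n_{0100}, \ \delta^{c}_2 + \delta^{c}_3 - \delta_1\}$$

and $d^{r} = \min\{\delta_1, n_{0000} + n_{0001}\}$; then $R^{d,0}$ is not empty if and
only if $d$ satisfies $d^{l} \le d \le d^{r}$. We can show easily that

$$Q(C_1) = \bigcup_{d^{l} \le d \le d^{r}} R^{d,\delta^{c}_4}.$$  (17)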
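The range of $d$ used in (17) can be computed directly from $\delta$ and the sizes $n_\alpha$. The sketch below assumes the following reading of the definitions in this subsection: three thresholds $\max\{0, \lfloor(\delta_1+\delta_i+1)/2\rfloor\}$ for $i = 2, 3, 4$, a lower end that is the largest of the five lower bounds implied by (16), and an upper end $\min\{\delta_1, n_{0000}+n_{0001}\}$. The dictionary layout, the function name and the example values are hypothetical.

```python
def d_range(delta, n):
    """delta = (delta_1, ..., delta_4); n maps a 4-bit string alpha to n_alpha.

    Returns the thresholds for i = 2, 3, 4 and the lower and upper ends of d.
    """
    d1 = delta[0]
    # Python's // floors towards minus infinity, matching the floor brackets.
    dc = {i: max(0, (d1 + delta[i - 1] + 1) // 2) for i in (2, 3, 4)}
    c1_rest = ["0010", "0011", "0100", "0101", "0110", "0111"]  # C_1 minus {0000, 0001}
    d_lo = max(0,
               d1 - sum(n[a] for a in c1_rest),
               dc[2] - n["0011"] - n["0010"],
               dc[3] - n["0101"] - n["0100"],
               dc[2] + dc[3] - d1)
    d_hi = min(d1, n["0000"] + n["0001"])
    return dc, d_lo, d_hi


n = {a: 1 for a in ["0000", "0001", "0010", "0011",
                    "0100", "0101", "0110", "0111"]}
print(d_range((3, 1, 0, -2), n))
```

If the lower end exceeds the upper end, no value of $d$ is admissible and the corresponding sub-IPP has no feasible tuple.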

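For any $d$ with $d^{l} \le d \le d^{r}$, let

$$\nu(d) = \delta_1 - d - \max\{\delta^{c}_2 - d, 0\} - \max\{\delta^{c}_3 - d, 0\} \ge 0,$$  (18)

$$\mathcal{D}(d) = (\mathcal{D}_{0011} \cup \mathcal{D}_{0010})^{(\delta^{c}_2 - d)} \cup (\mathcal{D}_{0101} \cup \mathcal{D}_{0100})^{(\delta^{c}_3 - d)},$$  (19)

$$\mathcal{D}^*(d) \triangleq \mathcal{D}(d) \cup (\mathcal{D}_{0000} \cup \mathcal{D}_{0001})^{(d)} \cup ((\bigcup_{\alpha \in C'_1} \mathcal{D}_\alpha) \setminus \mathcal{D}(d))^{(\nu(d))},$$  (20)

and let $q^{d,0}$ denote the $2^4$-tuple which satisfies
$\mathcal{D}(q^{d,0}) = \mathcal{D}^*(d)$. Then $q^{d,0}$ is an $R^{d,0}$-optimum. We
will show an iterative method to find an $R^{d,k+1}$-optimum, or determine
$R^{d,k+1} = \emptyset$, from an $R^{d,k}$-optimum.

For any $2^4$-tuple $q \in Q^4$ and two sequences $\alpha, \alpha' \in B^4$, let
$\varphi(q, \alpha, \alpha')$ denote the $2^4$-tuple which satisfies

$$\varphi(q, \alpha, \alpha')_\alpha = q_\alpha + 1, \quad \varphi(q, \alpha, \alpha')_{\alpha'} = q_{\alpha'} - 1,$$
$$\varphi(q, \alpha, \alpha')_{\alpha''} = q_{\alpha''}, \text{ for all } \alpha'' \in B^4 \setminus \{\alpha, \alpha'\}.$$  (21)

Lemma 2. Assume $q^{d,k}$ is an $R^{d,k}$-optimum which satisfies

$$q^{d,k}_{0000} + q^{d,k}_{0010} + q^{d,k}_{0100} + q^{d,k}_{0110} = k.$$  (22)

Let $\Phi(q^{d,k}) \triangleq \{\varphi(q^{d,k}, \alpha, \alpha') : (\alpha, \alpha') \in A\} \cap R^{d,k+1}$,
where $A = \{(0000, 0001)\} \cup (\{0010, 0100, 0110\} \times \{0011, 0101, 0111\})$.
Then $R^{d,k+1} = \emptyset$ if and only if $\Phi(q^{d,k}) = \emptyset$, and the
$\Phi(q^{d,k})$-optimum must be an $R^{d,k+1}$-optimum if
$\Phi(q^{d,k}) \ne \emptyset$.

Assume $q^{d,k}$ is an $R^{d,k}$-optimum which satisfies (22) and
$\Phi(q^{d,k}) \ne \emptyset$. Since between $L'(\varphi(q^{d,k}, \alpha, \alpha'))$
and $L'(\varphi(q^{d,k}, \beta, \beta'))$ with $\alpha = \beta$ or $\alpha' = \beta'$
we can find the smaller one with no operations of real numbers, we see easily that to
determine the $\Phi(q^{d,k})$-optimum it is enough to consider
$L'(q) - L'(q^{d,k})$ for at most 4 tuples $q$ of $\Phi(q^{d,k})$, and thus with at
most 7 operations of additions and comparisons of real numbers we can find an
$R^{d,k+1}$-optimum from $q^{d,k}$. Hence, by iteration from $q^{d,0}$, with at most
$7\delta^{c}_4$ operations of additions and comparisons of real numbers we can find
an $R^{d,\delta^{c}_4}$-optimum, or determine $R^{d,\delta^{c}_4} = \emptyset$. With
respect to (17) and the definition of the $R^{d,\delta^{c}_4}$-optimums, we can give
the following Sub-algorithm-$Q(C_1)$ to compute $L'[Q(C_1)]$.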

Sub-algorithm-$Q(C_1)$

Input. The $N$-tuple $r$ with (6). The 4-tuple
$\delta = (\delta_1, \delta_2, \delta_3, \delta_4)$ defined by (5) and
satisfying (8). $\mathcal{D}_\alpha$ and $n_\alpha$ defined by (3) for all
$\alpha \in C_1$.

Output. $L'[Q(C_1)]$.

Step 1. Compute $\delta^{c}_j$, $j = 2, 3, 4$, and $d^{l}$ and $d^{r}$.

Step 2. If $d^{l} > d^{r}$, then output $+\infty$ (i.e. $Q(C_1) = \emptyset$) and END;
otherwise, for each integer $d$ with $d^{l} \le d \le d^{r}$, generate the set
$\mathcal{D}^*(d)$ and determine the $2^4$-tuple $q^{d,0}$ with
$\mathcal{D}(q^{d,0}) = \mathcal{D}^*(d)$, and then, by using Lemma 2, find an
$R^{d,\delta^{c}_4}$-optimum $q^{d,\delta^{c}_4}$ or determine
$R^{d,\delta^{c}_4} = \emptyset$.

Step 3. Output $\min_{d^{l} \le d \le d^{r}} L'(q^{d,\delta^{c}_4})$ and END. □

The number of $d$ with $d^{l} \le d \le d^{r}$ is at most $\delta_1 + 1$. For each $d$
with $d^{l} \le d \le d^{r}$, it needs at most $7\delta^{c}_4 \le 7\delta_1$
operations of additions and comparisons of real numbers to find an
$R^{d,\delta^{c}_4}$-optimum $q^{d,\delta^{c}_4}$ or to determine
$R^{d,\delta^{c}_4} = \emptyset$. To compute $L'(q^{d,\delta^{c}_4})$ it needs
$\delta_1 - 1$ additions of real numbers. Hence the total number of operations of
additions and comparisons of real numbers for Sub-algorithm-$Q(C_1)$ is not more than
$8\delta_1(\delta_1 + 1)$.

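4.2 Sub-algorithm for the Evaluation of $L'[Q(D_\ell)]$

For any integer $k$ with $0 \le k \le 15$, let
$b(k) = (b(k)_1, b(k)_2, b(k)_3, b(k)_4)$ denote the sequence in $B^4$ which satisfies
$\sum_{j=1}^{4} b(k)_j 2^{j-1} = k$. Clearly, $Q(D_\ell)$ consists of the tuples $q$
in $Q^4$ which satisfy (4) and

$$\sum_{j=0}^{3} (q_{s_j} + q_{s'_j}) - 2(q_{s_i} + q_{s'_i}) = \rho(D_\ell)_{a_i}, \text{ for } i = 1, 2, 3,$$
$$\sum_{j=0}^{3} (q_{s_j} - q_{s'_j}) \ge \delta_\ell, \text{ and } q_\alpha = 0 \text{ for } \alpha \notin D_\ell,$$  (23)

where $s_0 = 0000$, $s'_0 = b(2^{\ell-1})$ and for $i = 1, 2, 3$

$$a_i \triangleq i + \ell \text{ if } i + \ell \le 4, \text{ and } a_i \triangleq i + \ell - 4 \text{ otherwise}, \quad s_i \triangleq b(2^{a_i-1}), \quad s'_i \triangleq b(2^{a_i-1} + 2^{\ell-1}).$$  (24)

For $i = 1, 2, 3$, let $\delta^{c}_i = (\sum_{j \in I_i} \rho(D_\ell)_{a_j})/2$, where
$I_i = \{1, 2, 3\} \setminus \{i\}$. Let
$\delta^{0} = \max\{0, \lceil(\delta_\ell + \sum_{j=1}^{3} \rho(D_\ell)_{a_j})/2\rceil\} = \max\{0, \lceil(\delta_\ell + \sum_{j=1}^{3} \delta^{c}_j)/2\rceil\}$.
For nonnegative integers $d$, $k$, let $R^{d,k}$ denote the set of the $2^4$-tuples
$q$ in $Q^4$ which satisfy (4) and

$$q_{s_0} + q_{s'_0} = d, \quad q_{s_0} + q_{s_1} + q_{s_2} + q_{s_3} \ge k,$$
$$q_{s_i} + q_{s'_i} = \delta^{c}_i - d, \text{ for } i = 1, 2, 3,$$  (25)
$$q_\alpha = 0, \text{ for } \alpha \notin D_\ell.$$

Clearly, $R^{d,k+1} \subseteq R^{d,k} \subseteq \cdots \subseteq R^{d,0} \subseteq Q^4$.
Let $d^{r} = \min\{n_{s_0} + n_{s'_0}, \delta^{c}_1, \delta^{c}_2, \delta^{c}_3\}$
and $d^{l} = \max\{0, \max_{1 \le i \le 3}(\delta^{c}_i - n_{s_i} - n_{s'_i})\}$. Then
we can prove easily that

$$Q(D_\ell) = \bigcup_{d^{l} \le d \le d^{r}} R^{d,\delta^{0}-d}.$$  (26)

For any $d$ with $d^{l} \le d \le d^{r}$, let $q^{d,0}$ denote the $2^4$-tuple which
satisfies

$$\mathcal{D}(q^{d,0}) = (\mathcal{D}_{s_0} \cup \mathcal{D}_{s'_0})^{(d)} \cup \bigcup_{1 \le i \le 3} (\mathcal{D}_{s_i} \cup \mathcal{D}_{s'_i})^{(\delta^{c}_i - d)}.$$  (27)

Then $q^{d,0}$ must be an $R^{d,0}$-optimum. The following lemma suggests an iterative
method for finding an $R^{d,k+1}$-optimum, or determining $R^{d,k+1} = \emptyset$,
from an $R^{d,k}$-optimum.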



Lemma 3. Assume $q^{d,k}$ is an $R^{d,k}$-optimum which satisfies

$$q^{d,k}_{s_0} + q^{d,k}_{s_1} + q^{d,k}_{s_2} + q^{d,k}_{s_3} = k.$$  (28)

Let $\Psi(q^{d,k}) = \{\varphi(q^{d,k}, s_j, s'_j) : 0 \le j \le 3\} \cap R^{d,k+1}$.
Then $R^{d,k+1} = \emptyset$ if and only if $\Psi(q^{d,k}) = \emptyset$, and the
$\Psi(q^{d,k})$-optimum is an $R^{d,k+1}$-optimum if $\Psi(q^{d,k}) \ne \emptyset$.

Similar to the Sub-algorithm-$Q(C_1)$, we can devise a Sub-algorithm-$Q(D_\ell)$
to compute $L'[Q(D_\ell)]$. The detail of Sub-algorithm-$Q(D_\ell)$ is given in [8].
The number of operations of additions and comparisons of real numbers for
Sub-algorithm-$Q(D_\ell)$ is not more than $25\delta_1(\delta_1 + 1)/2$.



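4.3 Sub-algorithm for the Evaluation of $L'[Q(E_\ell)]$

Clearly, $Q(E_\ell)$ consists of the $2^4$-tuples $q$ in $Q^4$ which satisfy (4) and

$$q_{s_0} + q_{s'_0} + \sum_{j \in I_i} (q_{s_j} - q_{s^*_j}) - q_{s_i} + q_{s^*_i} = \rho(E_\ell)_{a_i}, \text{ for } i = 1, 2, 3,$$
$$q_{s_0} - q_{s'_0} + \sum_{j=1}^{3} (q_{s_j} + q_{s^*_j}) = \rho(E_\ell)_\ell, \text{ and } q_\alpha = 0 \text{ for } \alpha \notin E_\ell,$$  (29)

where $s^*_i = b(15 - 2^{a_i-1} - 2^{\ell-1})$, $i = 1, 2, 3$. For $i = 1, 2, 3$, let
$\delta^{c}_i = (\rho(E_\ell)_\ell + \rho(E_\ell)_{a_i})/2 = \lceil(\delta_\ell + \delta_{a_i})/2\rceil$. Let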



5 "= = 



\jZpmo 



(5f + (5f + — 5i, if 6^ is even, 

-((5^ + 1), otherwise. 



where 5^ = - Then the 2‘^-tuples q in Q{Ei) can be given by 



(30) 



$$q_{s^*_i} = \delta^{c}_i - x_i, \quad q_{s_i} = \sum_{j \in I_i} x_j - w, \text{ for } i = 1, 2, 3,$$
$$q_{s'_0} = \delta^{0} - w, \quad q_{s_0} = 2w - x_1 - x_2 - x_3, \text{ and } q_\alpha = 0, \text{ for } \alpha \notin E_\ell,$$  (31)

where $w, x_1, x_2, x_3$ satisfy $\max\{0, \delta^{0} - n_{s'_0}\} \le w \le \delta^{0}$ and

$$0 \le 2w - x_1 - x_2 - x_3 \le n_{s_0},$$  (32)

$$\max\{0, \delta^{c}_i - n_{s^*_i}\} \le x_i \le \delta^{c}_i, \quad 0 \le \sum_{j \in I_i} x_j - w \le n_{s_i}, \text{ for } i = 1, 2, 3.$$  (33)

For $\max\{0, \delta^{0} - n_{s'_0}\} \le w \le \delta^{0}$, let $\Omega(w)$ denote the
set of pairs $\pi = (x_1, x_2, x_3)$ which satisfy (32) and (33). For
$\pi \in \Omega(w)$, let $q_w(\pi)$ be the $2^4$-tuple of $Q(E_\ell)$ defined by (31).
We write $L_w(\pi) = L'(q_w(\pi))$. For any subset $\Omega'$ of $\Omega(w)$, let
$L_w[\Omega'] = \min_{\pi \in \Omega'} L_w(\pi)$, and write
$L_w[\emptyset] = +\infty$. If a pair $\pi \in \Omega'$ satisfies
$L_w(\pi) = L_w[\Omega']$, we call it an $\Omega'$-pair. We will show a method to
evaluate $L'[Q(E_\ell)]$ by finding an $\Omega(w)$-pair for each $w$ with
$\Omega(w) \ne \emptyset$.

For integers $w$, $x$, let $\Omega(w, x) = \{\pi \in \Omega(w) : \pi = (\cdot, \cdot, x)\}$.
At first, we consider determining $\Omega(w, x)$-pairs. Let $T$ denote the set of
pairs $\pi' = (x'_1, x'_2, x'_3)$ with $x'_i \in \{1, 0, -1\}$. For
$\pi \in \Omega(w)$ and $\pi' \in T$, we say that we can grow $\pi$ in the
$\pi'$-direction to $\pi + \pi'$ if $L_w(\pi) > L_w(\pi + \pi')$. For any nonempty
set $\Omega(w, x)$, we can find an $\Omega(w, x)$-pair by the following growth
procedure.



Growth Procedure of an $\Omega(w, x)$-pair $t(w, x)$

Preparation. Select an arbitrary pair $\pi$ of $\Omega(w, x)$ as the seed.

Step 1. Grow $\pi$ in the $(1, 0, 0)$-direction and the $(-1, 0, 0)$-direction
until no further growth is possible, and then go to Step 2.

Step 2. If we can grow $\pi$ in the $\pi'$-direction for some $\pi'$ of
$\{(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)\}$, grow it in the $\pi'$-direction
step by step until no further growth is possible and then go to Step 2; otherwise go
to Step 3.

Step 3. Grow $\pi$ in the $(1, -1, 0)$-direction and the $(-1, 1, 0)$-direction until
no further growth is possible, and then output $t(w, x) = \pi$ and END. □

Lemma 4. The growth procedure outputs an $\Omega(w,x)$-pair $t(w,x)$ with at most
$16w$ operations of additions and comparisons of real numbers.
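
The three steps of the growth procedure can be sketched as a small local-descent routine. This is only an illustration under stated assumptions: `cost` stands in for $L_w$ and `feasible` for membership in $\Omega(w,x)$, both hypothetical callables supplied by the caller, and the toy cost below is not the metric of this paper. The optimality guarantee of Lemma 4 relies on the structure of the real $L_w$ and is not reproduced by the sketch.

```python
def grow(pi, direction, cost, feasible):
    """Move pi along `direction` while each step stays feasible and strictly lowers the cost."""
    while True:
        nxt = tuple(a + b for a, b in zip(pi, direction))
        if feasible(nxt) and cost(nxt) < cost(pi):
            pi = nxt
        else:
            return pi


def growth_procedure(seed, cost, feasible):
    """Steps 1-3 of the growth procedure; the third coordinate is never changed."""
    pi = seed
    # Step 1: grow along the first coordinate only.
    for d in ((1, 0, 0), (-1, 0, 0)):
        pi = grow(pi, d, cost, feasible)
    # Step 2: repeat over the four axis directions until none of them helps.
    axis = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0))
    improved = True
    while improved:
        improved = False
        for d in axis:
            new = grow(pi, d, cost, feasible)
            if new != pi:
                pi, improved = new, True
    # Step 3: grow along the two diagonal directions, then output.
    for d in ((1, -1, 0), (-1, 1, 0)):
        pi = grow(pi, d, cost, feasible)
    return pi


# Toy stand-ins (assumptions, not taken from the paper).
def toy_cost(pi):
    return (pi[0] - 3) ** 2 + (pi[1] - 1) ** 2


def toy_feasible(pi):
    return 0 <= pi[0] <= 5 and 0 <= pi[1] <= 5 and pi[2] == 2


result = growth_procedure((0, 0, 2), toy_cost, toy_feasible)
```

With the toy cost, the seed $(0,0,2)$ first moves along the first coordinate and then along the second, ending at the minimizer of the toy cost.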

Sometimes it is not easy to select a seed for this growth procedure. A concrete
method for giving a seed for the growth procedure is shown in [8]. However, it
is not necessary to find all the $\Omega(w,x)$-pairs by the growth procedure. Indeed, the
following lemma suggests a simple method for finding an $\Omega(w)$-pair from a known
$\Omega(w,x)$-pair.
Lemma 5. Let $t(w,x)$ be an $\Omega(w,x)$-pair and $T^* = \{(0,0,0), (-1,0,0),
(0,-1,0),$ Then

1. For $\varepsilon \in \{1,-1\}$, if $\Omega(w, x+\varepsilon) \ne \emptyset$, then there exists a pair $\pi'$ in $T^*$ such
that $t(w,x) + (0,0,\varepsilon) + \varepsilon \cdot \pi'$ is an $\Omega(w, x+\varepsilon)$-pair.

2. $t(w,x)$ is an $\Omega(w)$-pair if and only if $L_w(t(w,x)) \le \min\{L_w[\Omega(w,x-1)],
L_w[\Omega(w,x+1)]\}$.

For £ G {1, —1}, according to Lemma 5, from an fl(w, x)-pair t(w, x) we can 
get an Q{w, x + £)-pair t{w, x + e) with at most 20 operations of additions and 
comparisons of real numbers, and determine whether Lw{t{w, x)) < x+ 

£)) holds or not with 5 more operations of additions and comparisons of real 
numbers. Then, with respect to Lemmas 4 and 5 and f2{w^ x) = 0 for x > w, we 
can find an l7(r(;)-pair t{w) with at most \Qw+2bw /2 operations of additions and 
comparisons of real numbers. Furthermore, since the number of additions and 
comparisons of real numbers for computing Lw{t{w)) is + + <5f — 2 w — 1 

and = 0 for ic > we can devise a Sub-algorithm-(5(i?^) to evaluate 

L[[Q{Ei)\ with at most + (5f + <5f) < 63(5i(25i + l)/2 

operations of additions and comparisons of real numbers. The detail of the Sub- 
algorithm-Q(£’f) is given in [8]. 
Abstract. In [2,4] the notion of a recursive code was introduced and
some constructions of recursive MDS codes were proposed. The main
result was that for any $q \notin \{2,6\}$ (except possibly $q \in \{14,18,26,42\}$)
there exists a recursive MDS code in an alphabet of $q$ elements of length
4 and combinatorial dimension 2 (i.e. a recursive $[4,2,3]_q$-code). One of
the constructions we used there was that of pseudogeometries; it enabled
us to show that for any $q > 126$ (except possibly $q = 164$) there exists
a recursive $[4,2,3]_q$-code that contains all the “constants”. One part of
the present note is a further application of the pseudogeometry
construction, which shows that for any $q > 164$ (resp. $q > 26644$) there exists
a recursive $[7,2,6]_q$-code (resp. $[13,2,12]_q$-code) containing the “constants”.
Another result presented here is a negative one: we show that there is no
nontrivial pseudogeometry consisting of 14, 18, 26 or 42 points with no
lines of order 2, 3, 4 or 6, so the pseudogeometry construction cannot be
applied to settle the question mentioned above. In both cases
the use of a computer is essential.



Introduction 

A code $\mathcal{K} \subseteq \Omega^n$ in an alphabet $\Omega$ of $q$ elements is called $k$-recursive, $1 \le k \le n$,
if there exists a function $f : \Omega^k \to \Omega$ such that $\mathcal{K}$ consists of all the rows
$u(\overline{0, n-1}) = (u(0), \ldots, u(n-1)) \in \Omega^n$ with the property

    $u(i+k) = f(u(\overline{i, i+k-1})), \quad i \in \overline{0, n-k-1}.$

In other words, $\mathcal{K}$ is the set of all output $n$-sequences of a feedback shift register
with feedback function $f$. We denote $\mathcal{K} = \mathcal{K}(n, f)$ and investigate the existence
problem for MDS codes of this type, i.e. recursive $[n, k, n-k+1]_q$-codes.
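
As a small illustration of this definition, the sketch below enumerates a recursive code $\mathcal{K}(n, f)$ by running the feedback shift register from every initial state and then checks the MDS property by brute force. The alphabet $\mathbb{Z}_5$ and the law $f(x,y) = x + y \bmod 5$ are assumptions chosen for the example; they are not taken from the text.

```python
from itertools import product


def recursive_code(n, k, q, f):
    """All length-n outputs of the feedback shift register u(i+k) = f(u(i), ..., u(i+k-1))."""
    words = []
    for start in product(range(q), repeat=k):
        u = list(start)
        while len(u) < n:
            u.append(f(*u[-k:]))
        words.append(tuple(u))
    return words


def min_distance(words):
    """Minimum Hamming distance over all pairs of distinct code words."""
    return min(
        sum(a != b for a, b in zip(u, v))
        for i, u in enumerate(words)
        for v in words[i + 1:]
    )


q = 5
code = recursive_code(4, 2, q, lambda x, y: (x + y) % q)
is_mds = min_distance(code) == 4 - 2 + 1  # MDS means d = n - k + 1
```

Here the code has $q^2 = 25$ words and minimum distance 3, so it is a recursive $[4,2,3]_5$-code.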

In connection with this, we consider the following three parameters:

$n(k,q)$: the maximum length of MDS codes $\mathcal{K}$ of (combinatorial) dimension $k$
($|\mathcal{K}| = q^k$) in an alphabet $\Omega$ of cardinality $q$.
$n^{r}(k,q)$: the maximum length of $k$-recursive MDS codes of the same type.
$n^{ir}(k,q)$: the maximum length of $k$-recursive MDS codes of the same type
which contain all the “constants”, i.e. all the words $(a, \ldots, a)$, $a \in \Omega$.

It is clear that $n^{ir}(k,q) \le n^{r}(k,q) \le n(k,q)$.
Here we study only the case $k = 2$. In this case our problem has the
following simplification. Let us define the $i$-th recursive derivative $f^{(i)}(x,y)$ of the
recursive law $f(x,y)$ recursively: $f^{(0)}(x,y) = f(x,y)$, $f^{(1)}(x,y) = f(y, f(x,y))$,
$f^{(i)}(x,y) = f(f^{(i-2)}(x,y), f^{(i-1)}(x,y))$. A necessary condition for $\mathcal{K}(n,f)$ to
be MDS is that $(\Omega, f(x,y))$ is a quasigroup. We say that a quasigroup $(\Omega, f)$ is
$t$-stable if $(\Omega, f^{(i)})$ is also a quasigroup for $i \in \overline{1,t}$.

It was proved in [2, Theorem 4] that $n^{r}(2,q) \ge m$ (resp., $n^{ir}(2,q) \ge m$) if
and only if there exists an $(m-3)$-stable (idempotent) quasigroup $f(x,y)$ of
order $q$.
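
The recursive derivatives and the $t$-stability test are easy to check by machine for small orders. The following sketch assumes the alphabet $\{0, \ldots, q-1\}$ and, purely as an example, the law $f(x,y) = x + y \bmod 5$; the function names are illustrative.

```python
def is_quasigroup(op, q):
    """True if every row and every column of the operation table is a permutation."""
    full = set(range(q))
    rows = all({op(x, y) for y in range(q)} == full for x in range(q))
    cols = all({op(x, y) for x in range(q)} == full for y in range(q))
    return rows and cols


def recursive_derivative(f, i):
    """Return f^(i), i.e. the term u(i+2) of the sequence started by u(0)=x, u(1)=y."""
    def g(x, y):
        a, b = y, f(x, y)
        for _ in range(i):
            a, b = b, f(a, b)
        return b
    return g


def is_t_stable(f, q, t):
    """True if f = f^(0) and f^(1), ..., f^(t) are all quasigroup operations."""
    return all(is_quasigroup(recursive_derivative(f, i), q) for i in range(t + 1))


def fib(x, y):
    return (x + y) % 5


stable_2 = is_t_stable(fib, 5, 2)
stable_3 = is_t_stable(fib, 5, 3)
```

For this law the derivatives are $x + 2y$, $2x + 3y$ and $3x + 5y \equiv 3x \pmod 5$, so the quasigroup is 2-stable but not 3-stable.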

The aim of this note is to show how the notion of pseudogeometries can be
used to provide some estimates of $n^{ir}(2,q)$. We also show some limitations of
this technique. The term “pseudogeometry” was taken from [1], where
pseudogeometries were used to construct Latin squares which are orthogonal to their
transposes. Since the construction of the quasigroups under consideration may be
reduced to the construction of orthogonal Latin squares with special properties,
it was natural to use pseudogeometries for this purpose also.

Recall that a pseudogeometry is a pair $\mathcal{P} = (P, \mathcal{L})$, where $\mathcal{L}$ is a set
of non-empty subsets of the non-empty set $P$ such that for any distinct $x, y \in P$ there
is a unique subset $L(x,y) \in \mathcal{L}$ with $\{x,y\} \subseteq L(x,y)$. The elements of the
sets $P$ and $\mathcal{L}$ are called points and lines, respectively. We consider only finite
pseudogeometries. The cardinality of a line will often be called its length. Note
that if one adds to or removes from $\mathcal{L}$ any set of one-point sets, the pair $(P, \mathcal{L})$
remains a pseudogeometry. We will say that a pseudogeometry $\mathcal{P} = (P, \mathcal{L})$ is
nontrivial if $P \notin \mathcal{L}$. The standard examples of pseudogeometries are affine and
projective planes over finite fields.



1 Idempotent Quasigroups of Large Orders 

For the convenience of the reader we recall here the definition of the
accompanying pseudogeometry for a mutually orthogonal set of Latin squares [2]. Let $q$
be some positive integer and suppose that there exists a mutually orthogonal set
of $s$ Latin squares $f_1, \ldots, f_s$ of order $q$, $f_i : (\overline{0, q-1})^2 \to \overline{0, q-1}$. Consider the
set $P$ of all pairs $(x,y)$, $x \in \overline{0, q-1}$, $y \in \overline{-1, s}$. Define lines of two types:

(1) horizontal lines:

    $H_y = \{(x,y) : x \in \overline{0, q-1}\}$, for any fixed $y \in \overline{-1, s}$;

(2) skew lines:

    $S_{ij} = \{(i,-1), (j,0)\} \cup \{(f_y(i,j), y) : y \in \overline{1,s}\}$, for any fixed $i, j \in \overline{0, q-1}$.

Let $\mathcal{H} = \{H_y : y \in \overline{-1,s}\}$, $\mathcal{S} = \{S_{ij} : i,j \in \overline{0, q-1}\}$, $\mathcal{L} = \mathcal{H} \cup \mathcal{S}$. It is easy to see
that $(P, \mathcal{L})$ is a pseudogeometry with lines of length $q$ and $s+2$. Later we use this
construction mostly in the case of primary $q$, when one can take $s = q-1$. Note
that in this case one can obtain the accompanying pseudogeometry by removing
one point from the projective plane over a field of order $q$.
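
The construction can be checked mechanically for a small prime order. The sketch below assumes $q$ prime and uses the standard family $f_y(i,j) = i + yj \bmod q$, $y = 1, \ldots, q-1$, as the mutually orthogonal Latin squares; this particular family and the function names are choices made for the example.

```python
from itertools import combinations


def accompanying_pseudogeometry(q):
    """Points and lines of the accompanying pseudogeometry for f_y(i, j) = i + y*j mod q (q prime)."""
    s = q - 1
    points = [(x, y) for x in range(q) for y in range(-1, s + 1)]
    horizontal = [frozenset((x, y) for x in range(q)) for y in range(-1, s + 1)]
    skew = [
        frozenset([(i, -1), (j, 0)] + [((i + y * j) % q, y) for y in range(1, s + 1)])
        for i in range(q)
        for j in range(q)
    ]
    return points, horizontal + skew


def is_pseudogeometry(points, lines):
    """Every pair of distinct points must lie on exactly one line."""
    return all(
        sum(1 for L in lines if a in L and b in L) == 1
        for a, b in combinations(points, 2)
    )


points, lines = accompanying_pseudogeometry(5)
line_lengths = sorted({len(L) for L in lines})
ok = is_pseudogeometry(points, lines)
```

For $q = 5$ this gives 30 points (a projective plane of order 5 with one point removed) and lines of lengths $q = 5$ and $s + 2 = 6$ only.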
Let $(P, \mathcal{L})$ be an arbitrary pseudogeometry, and $P'$ an arbitrary non-empty
subset of the set $P$. Let $\mathcal{L}' = \{L \cap P' : L \in \mathcal{L}, |L \cap P'| > 1\}$. It is easy to
see that $(P', \mathcal{L}')$ is also a pseudogeometry. We shall call $(P', \mathcal{L}')$ the reduced
pseudogeometry for the pseudogeometry $(P, \mathcal{L})$.

The following result is very close to one of the main constructions of [1]: 

Theorem 1 Let $(P, \mathcal{L})$ be a pseudogeometry. Then

    $n^{ir}(2, |P|) \ge \min\{n^{ir}(2, |L|) : L \in \mathcal{L}\}.$

□ The proof follows that of [2, Theorem 10]. Let

    $t = \min\{n^{ir}(2, |L|) : L \in \mathcal{L}\} - 3$

and define an operation $f_L : L \times L \to L$ for every $L \in \mathcal{L}$ in such a way that $(L, f_L)$
becomes an idempotent $t$-stable quasigroup. Then for any $x, y \in P$ such that
$x \ne y$ let $f(x,y) = f_L(x,y)$ where $L = L(x,y)$, and let $f(x,x) = x$ for any
$x \in P$. It is obvious that the operation is well-defined and $(P, f)$ is an idempotent
quasigroup. So if $y = f(x,y)$ then $x = y$, $x, y \in P$. An easy induction argument
shows that for any $i \in \overline{1,t}$, any $L \in \mathcal{L}$ and any $x, y \in L$, $f_L^{(i-1)}(x,y) = f_L^{(i)}(x,y)$
implies $x = y$. So if $x \ne y$, then $f^{(i)}(x,y) = f_L^{(i)}(x,y)$ where $L = L(x,y)$,
and $f^{(i)}(x,x) = x$ for any $x \in P$, hence $(P, f^{(i)})$ is again a quasigroup. The
application of [2, Theorem 4] finishes the proof. □
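
The gluing step of this proof, which defines $f$ on $P$ from operations $f_L$ on the lines, can be illustrated on the smallest projective plane. The sketch assumes the Fano plane with points $0, \ldots, 6$ and, on each 3-point line, the idempotent operation sending two distinct points to the third one; both are choices made for the example. Since $3 \notin I(4)$, these line operations are not 1-stable, so the sketch shows only that the glued operation is an idempotent quasigroup, not the stability part of the argument.

```python
FANO_LINES = [
    frozenset(s)
    for s in ({0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5}, {1, 4, 6}, {2, 3, 6}, {2, 4, 5})
]


def line_through(x, y, lines):
    """The unique line L(x, y) containing two distinct points."""
    (line,) = [L for L in lines if x in L and y in L]
    return line


def third_point(line, x, y):
    """Idempotent quasigroup on a 3-point line: two distinct points give the remaining one."""
    (z,) = line - {x, y}
    return z


def glued_operation(lines, line_op):
    """f(x, x) = x and f(x, y) = f_L(x, y) with L = L(x, y), as in the proof of Theorem 1."""
    def f(x, y):
        if x == y:
            return x
        return line_op(line_through(x, y, lines), x, y)
    return f


f = glued_operation(FANO_LINES, third_point)
pts = range(7)
idempotent = all(f(x, x) == x for x in pts)
latin = all(
    {f(x, y) for y in pts} == set(pts) and {f(y, x) for y in pts} == set(pts)
    for x in pts
)
```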

Now let $I(n)$ be the set of integers $q$ such that $n^{ir}(2,q) \ge n$. It is known that
$2, 3, 4, 6 \notin I(4)$ and that $q \in I(4)$ for any $q \ge 127$ (except possibly $q = 164$) [2,
Corollary 8]. The authors do not know if $10 \in I(4)$, but any prime number $p > n$
and any primary number $q > n$ belong to $I(n)$, as well as any product of such
numbers (see [2, Propositions 10, 11]).

The following theorem is a slight generalization of [2, Theorem 11], but it 
gives many extra numbers in the search described below. 

Theorem 2 Let $n$ be an integer, $n \ge 4$, and let $s, t, l, m, d_1, \ldots, d_l$ be
non-negative integers such that

(a) there exist $t + l + m - 2$ mutually orthogonal Latin squares of order $s$;

(b) $s, t, t+1, \ldots, t+l, d_1, \ldots, d_l \in I(n)$;

(c) if $m > 0$ then also $t+l+1, t+l+m \in I(n)$;

(d) $1 \le d_i \le s$ for $i = 1, \ldots, l$.

Then $q = st + d_1 + \cdots + d_l + m \in I(n)$.

□ Consider the accompanying pseudogeometry $(P, \mathcal{L})$ for the given set of
mutually orthogonal Latin squares. It contains $t+l+m$ horizontal lines which we shall
consider as numbered from the first one to the $(t+l+m)$-th. Take one skew line
and call it the distinguished vertical line. Delete $s - d_l$ points from the $(t+l)$-th
horizontal line, $s - d_{l-1}$ points from the $(t+l-1)$-th one and so on, in such a
way that no point of the distinguished vertical line is removed. Then delete all
the points of the remaining $m$ horizontal lines except those of the distinguished
vertical line.
Let $P'$ be the set of the remaining points and $(P', \mathcal{L}')$ the reduced
pseudogeometry. The situation is illustrated by Fig. 1 (here $s = 11$, $t = 7$, $l = 1$, $d_1 = 5$,
$m = 3$; empty circles represent deleted points, full circles represent the points of
$P'$, the left column of points represents the distinguished vertical line).



•••••oooooo 

•oooooooooo 

•oooooooooo 

•oooooooooo 



Fig. 1. Reduced pseudogeometry. 



If $L$ is a horizontal line in $\mathcal{L}$ such that $|L \cap P'| > 1$ then $|L \cap P'|$ is either
$s$ or one of the $d_i$, $i \in \overline{1,l}$. If $L$ is a skew line in $\mathcal{L}$, and it is not the
distinguished vertical line, then $t \le |L \cap P'| \le t + l + 1$ since $L$ contains not more
than one remaining point of each of the $l$ “shortened” horizontal lines and not more
than one point of the distinguished vertical line lying on the last $m$ horizontal
lines. Finally, if $L$ is the distinguished vertical line then $|L \cap P'| = |L| = t + l + m$.
Now the result follows from Theorem 1. □

As in [2] we obtain the following 

Corollary 3 If there exists a number $k$ such that $\overline{2^k - 1, 2^{2k+1}} \subset I(n)$ then
$q \in I(n)$ for any $q \ge 2^k - 1$.

□ The proof is exactly the same as that of [2, Corollary 8] and therefore is 
omitted. □ 

Now we shall outline the idea of the “experimental” part of our work. First
we fixed some reasonable value $q_{\max}$ and filled the array of current estimates
of $n^{ir}(2,q)$ for $q \le q_{\max}$ with initial values based on the known estimates
for prime and primary $q$ and on the quasigroup product construction. Then
the program repeatedly checked the numbers $q \in \overline{10, q_{\max}}$ for the existence of
numbers satisfying the conditions of Theorem 2 (with prime and primary $s$ and
$l < 3$), taking into account the estimates found on the previous
repetitions. The use of this algorithm was limited because we had to keep all
the found numbers in the computer’s memory. When we obtained rather long
series of consecutive numbers in $I(n)$ for some $n > 4$ we passed the result to
the second step of the algorithm, which used Theorem 2 only with prime and
primary $s$ and with $l = 0$ (we did not need to keep all the estimates now). We
summarize the computation results in the following
Theorem 4 $n^{ir}(2,q) \ge 7$ for all $q > 164$; $n^{ir}(2,q) \ge 13$ for all $q > 26644$.



One can conjecture that for any integer $n$ there exists an integer $q_0 = q_0(n)$
such that $n^{ir}(2,q) \ge n$ for all $q \ge q_0$. For $n(2,q)$, the corresponding fact is well
known (see e.g. [5]): $n(2,q) \ge q^{10/143}$. Note however that this gives $n(2,q) \ge 7$
only for $q \ge 7^{14.3} \approx 1.2 \cdot 10^{12}$ and $n(2,q) \ge 13$ only for $q \ge 13^{14.3} \approx 8.5 \cdot 10^{15}$.
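
The two thresholds quoted above follow from inverting the bound, read here as $n(2,q) \ge q^{10/143}$ (this reading of the exponent is an assumption based on the stated values $7^{14.3}$ and $13^{14.3}$). A quick numerical check, with an illustrative helper name:

```python
def threshold(n, exponent=10 / 143):
    """Smallest real q with q**exponent >= n, i.e. q = n**(1/exponent) = n**14.3."""
    return n ** (1 / exponent)


q7 = threshold(7)    # compare with the bound q > 164 of Theorem 4
q13 = threshold(13)  # compare with the bound q > 26644 of Theorem 4
```

The gap between these values and the bounds of Theorem 4 is about ten orders of magnitude in both cases.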

2 Nonexistence Theorem 

In this section we will consider the question of existence of a
pseudogeometry with lines whose lengths belong to a given set of integers. Recall that
it is still an open question if there exists a 1-stable quasigroup of order $q$ if
$q \in \{14, 18, 26, 42\}$. On the other hand, there are no 1-stable idempotent
quasigroups of order $q$ if $q \in \{2, 3, 4, 6\}$. So it would be desirable, as the first step
to the positive solution of the question by the pseudogeometry technique, to
construct pseudogeometries with 14, 18, 26 and 42 points without lines of
length 2, 3, 4, 6. But instead of that we have proved a negative result. First we
give a direct proof that there is no pseudogeometry with 14, 18 or 26 points
without lines of length 2, 3, 4, 6. For the pseudogeometries with 42 points and
the same restrictions on lines we deduce some properties that allow us to carry
out an exhaustive computer search, which shows that they do not exist either.
Let $\pi = (P, \mathcal{L})$ be a pseudogeometry. For any point $x \in P$ denote by $\mathcal{L}_x$ the
set of lines containing $x$, and for any $x \in P$ and $L \in \mathcal{L}$ such that $x \notin L$ let
$\mathcal{M}(x,L) = \{L(x,y) : y \in L\}$, $M(x,L) = \bigcup\{L' : L' \in \mathcal{M}(x,L)\}$. Let $m(\pi)$ and
$M(\pi)$ be the minimal and the maximal length of a line in $\pi$, respectively. Denote
by $\mathcal{L}^i$ the set of lines of length $i$ and let $k_i = |\mathcal{L}^i|$. We begin with some rather
elementary observations.

Lemma 5 If $\pi = (P, \mathcal{L})$ is a pseudogeometry then

    $|P|(|P| - 1) = \sum_i i(i-1) k_i$    (2.1)

□ It is sufficient to note that both sides of (2.1) are equal to the number
of ordered pairs $(x,y) \in P \times P$ such that $x \ne y$, taking into account that every
such pair belongs to $L \times L$ for exactly one line $L \in \mathcal{L}$. □

Lemma 6 If $\pi = (P, \mathcal{L})$ is a pseudogeometry with an even (odd) number of points
then every point in $P$ belongs to an odd (even) number of lines of even length.

□ Let $x \in P$. Then $P \setminus \{x\}$ is a disjoint union of the sets $L \setminus \{x\}$, where $L \in \mathcal{L}_x$, so

    $|P| - 1 = \sum_{L \in \mathcal{L}_x} (|L| - 1) \equiv k \pmod 2,$

where $k$ is the number of lines of even length in $\mathcal{L}_x$. □

Lemma 7 For any nontrivial pseudogeometry $\pi = (P, \mathcal{L})$,

    $(m(\pi) - 1) M(\pi) + 1 \le |P|$    (2.2)
□ Let $L$ be a line of length $M(\pi)$. Take a point $x \in P \setminus L$ and note that

    $|M(x,L)| = |\{x\} \cup (\bigcup_{y \in L} L(x,y) \setminus \{x\})| = 1 + \sum_{y \in L} (|L(x,y)| - 1),$

since the union in the previous line is a disjoint one. So $|P| \ge |M(x,L)| \ge
1 + (m(\pi) - 1) M(\pi)$. □

Note that for a projective plane equality holds in (2.2).

We will say for brevity that a pseudogeometry is admissible if it is nontrivial 
and does not contain a line whose length is 2, 3, 4 or 6. Now we arrive at the 
following 

Proposition 8 If $\pi = (P, \mathcal{L})$ is an admissible pseudogeometry, then either $|P| >
28$ or $|P| \in \{21, 25\}$.

□ First suppose that $m(\pi) \ge 7$. Then Lemma 7 gives $|P| \ge 7 \cdot 6 + 1 = 43$. So we
can suppose that $m(\pi) = 5$. Again if $M(\pi) \ge 7$ then $|P| \ge 7 \cdot 4 + 1 = 29$, and
only the case remains when all the lines have length 5. If any two lines in $\pi$ have
a common point then there is equality in (2.2) and so $|P| = 21$. Suppose there
are two “parallel” lines, say $L_1$ and $L_2$. Consider any line $L$ that intersects $L_1$
at some point $x$. Suppose $L \cap L_2 = \emptyset$. Then we have a disjoint union

    $\{x\} \cup (L_1 \setminus \{x\}) \cup (L \setminus \{x\}) \cup (M(x, L_2) \setminus \{x\}) \subseteq P,$

so $|P| \ge 1 + 4 + 4 + 4 \cdot 5 = 29$. So if $|P| \le 28$ then for any point $x \in L_1$ and any point
$y \notin L_1$ the line $L(x,y)$ intersects $L_2$. This means that $P = (L_1 \setminus \{x\}) \cup M(x, L_2)$
and $|P| = 4 + 21 = 25$. □

Proposition 8 means in particular that there is no hope to construct an 
idempotent 1-stable quasigroup of order 14, 18 or 26 using pseudogeometries. 

From this point till the end of the section we suppose that $\pi = (P, \mathcal{L})$ is
an admissible pseudogeometry with 42 points. We already know from Lemma 7
that $m(\pi) = 5$. The following lemma gives a rough estimate of $M(\pi)$.

Lemma 9 $M(\pi) < 10$.

□ By virtue of Lemma 7, $M(\pi) < 11$, since $4 \cdot 11 + 1 > 42$. Let $L$ be a line
in $\pi$ of length 10 and $x \in P \setminus L$. If some line in $\mathcal{M}(x,L)$ contains more than
5 (and hence more than 6) points, then as in the proof of Lemma 7 one has
$|P| \ge 1 + 6 + 9 \cdot 4 = 43 > 42$, a contradiction. So all the lines $L_0, \ldots, L_9$ of
$\mathcal{M}(x,L)$ have length 5, so $|M(x,L)| = 1 + 10 \cdot 4 = 41$. This means that there exists a point
$y \in P \setminus M(x,L)$. So there is a disjoint union $L(x,y) \cup (M(x,L) \setminus \{x\}) \subseteq P$, so
$|P| \ge |L(x,y)| + 10 \cdot 4 \ge 5 + 40 > 42$, a contradiction again. □

Now the only even length of a line in $\pi$ is 8, so by Lemma 6 every point in $P$
belongs to 1, 3 or 5 lines of length 8 (7 lines of length 8 with one common point
contain 50 points).

Lemma 10 Every two lines of length 8 in $\pi$ intersect each other.
□ Suppose $L_1, L_2 \in \mathcal{L}$, $|L_1| = |L_2| = 8$ and $L_1 \cap L_2 = \emptyset$. Consider a point
$x \in P \setminus (L_1 \cup L_2)$. Take a line $L \in \mathcal{L}_x$ of length 8. Suppose first that $L \cap L_1 \ne \emptyset$
and $L \cap L_2 = \emptyset$. As in the proof of Lemma 7, there is a disjoint union

    $\{x\} \cup (L_1 \setminus \{x\}) \cup (L \setminus \{x\}) \cup (M(x, L_2) \setminus \{x\}) \subseteq P,$

so $|P| \ge 1 + 2 \cdot 7 + 4 \cdot 8 = 47 > 42$, a contradiction. If $L \cap L_1 \ne \emptyset$ and $L \cap L_2 \ne \emptyset$
then $L \in \mathcal{M}(x, L_2)$, so again we have a disjoint union

    $\{x\} \cup (L_1 \setminus \{x\}) \cup (M(x, L_2) \setminus \{x\}) \subseteq P,$

but now one of the lines in $\mathcal{M}(x, L_2)$ has length 8, so $|M(x, L_2) \setminus \{x\}| \ge 7 \cdot 4 + 7 =
35$ and $|P| \ge 1 + 7 + 35 = 43 > 42$. So the only remaining possibility is that
line $L$ does not intersect lines $L_1$ and $L_2$. Let $L_3 = L$ and consider any point $y$
in $P \setminus (L_1 \cup L_2 \cup L_3)$. Again there is a line of length 8 that contains $y$ and by
the preceding argument it does not intersect lines $L_1$, $L_2$ and $L_3$. Repeating this
argument, we have that $|P|$ is a multiple of 8, which is wrong. □

Lemma 11 There is no point in $\pi$ that belongs to 5 lines of length 8.

□ Suppose on the contrary that $x \in P$ is such a point and $L_1, \ldots, L_5$ are 5 lines
of length 8 such that $x \in L_i$, $i \in \overline{1,5}$. Let $M = \bigcup_i L_i$. Since $|M| = 1 + 5 \cdot 7 = 36$
and the set $P \setminus M$ is covered by disjoint sets $L(x,y) \setminus \{x\}$, where $y$ spans $P \setminus M$,
of cardinality not less than 4 each, the only possibility is that $P \setminus M = L \setminus \{x\}$
for some line $L$ of length 7. Take any point $y \in L \setminus \{x\}$ and consider a line $K$ of
length 8 in $\mathcal{L}_y$. Then $K \notin \{L, L_1, \ldots, L_5\}$, $|K \cap M| = 5$ and $|K \cap L| = 1$, so $K$
contains 2 “extra points”. A contradiction. □

Now let us introduce some more notation. Let $N_i$ be the set of points in $P$
that belong to exactly $i$ lines of length 8.

Lemma 12 $|N_3| = 7$, $|N_1| = 35$, $k_8 = 7$, $k_9 = 0$. Moreover, the reduced
pseudogeometry $\pi' = (N_3, \mathcal{L}')$ is the projective plane $P^2(\mathbb{F}_2)$.

□ Let $n_i = |N_i|$, $i = 1, 3$. Then by Lemma 6 and Lemma 11 we have

    $n_1 + n_3 = 42.$    (2.3)

Counting all the points of the lines in $\mathcal{L}^8$ as different ones and taking into account
that each point of $N_3$ will be counted 3 times, we obtain

    $n_1 + 3 n_3 = 8 k_8.$    (2.4)

Now let $\Delta \subset \mathcal{L}^8 \times \mathcal{L}^8$ be the diagonal: $\Delta = \{(L,L) : L \in \mathcal{L}^8\}$ and consider the
map $\psi : \mathcal{L}^8 \times \mathcal{L}^8 \setminus \Delta \to N_3$ defined by the rule: $\psi(L_1, L_2)$ is the intersection point
of the lines $L_1$ and $L_2$. This map is well-defined by Lemma 10. By definition of
$N_3$, $|\psi^{-1}(x)| = 6$ for any $x \in N_3$. So

    $6 n_3 = k_8 (k_8 - 1).$    (2.5)
Solving the equation system (2.3)-(2.5) and using the condition $n_1 \ge 0$ we
obtain $n_3 = 7$, $n_1 = 35$, $k_8 = 7$. Since each point in $P$ belongs to some of the 7
lines in $\mathcal{L}^8$, there is no line of length 9 in $\mathcal{L}$, so $k_9 = 0$. Now consider the reduced
pseudogeometry $\pi' = (N_3, \mathcal{L}')$. First check that every two different points $x, y$ in
$N_3$ belong to one line of length 8 in $\mathcal{L}$. Indeed, if $(\mathcal{L}^8 \cap \mathcal{L}_x) \cap (\mathcal{L}^8 \cap \mathcal{L}_y) = \emptyset$ then
the three lines of $\mathcal{L}^8 \cap \mathcal{L}_x$ have 9 intersection points with the three lines of $\mathcal{L}^8 \cap \mathcal{L}_y$, which
is impossible. So $|N_3 \cap L| \le 1$ for any line $L \in \mathcal{L} \setminus \mathcal{L}^8$, and so $|\mathcal{L}'| = |\mathcal{L}^8| = 7$.
Note also that four points of $N_3$ cannot belong to one line: every one of these 4
points would belong to two other lines of length 8 and all these 9 lines would
be different, which contradicts the equality $k_8 = 7$. Using Lemma 5 for $\pi'$ and
$|\mathcal{L}'| = 7$ we have $k_2' + k_3' = 7$ and $2k_2' + 6k_3' = 42$, so $k_2' = 0$ and $k_3' = 7$.
So we have finally that every line in $\mathcal{L}'$ contains 3 points and every point is
contained in 3 lines, while
the numbers of points and lines in $\pi'$ are equal to 7. So $\pi'$ is isomorphic to the
projective plane $P^2(\mathbb{F}_2)$. □
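
The system (2.3)-(2.5) used in this proof is small enough to solve by exhaustive enumeration. The sketch below reads (2.4) as $n_1 + 3n_3 = 8k_8$ (each of the $k_8$ lines contributes 8 points); this reading of the garbled right-hand side is an assumption, although it is the one consistent with the stated solution.

```python
# Enumerate non-negative integer solutions of
#   n1 + n3 = 42,  n1 + 3*n3 = 8*k8,  6*n3 = k8*(k8 - 1).
solutions = [
    (42 - n3, n3, k8)
    for k8 in range(43)
    for n3 in range(43)                    # n3 <= 42 keeps n1 = 42 - n3 non-negative
    if (42 - n3) + 3 * n3 == 8 * k8        # (2.4)
    and 6 * n3 == k8 * (k8 - 1)            # (2.5)
]
```

Eliminating $n_1$ and $n_3$ by hand gives $k_8^2 - 25k_8 + 126 = 0$, with roots 7 and 18; the root 18 would force $n_3 = 51 > 42$, so only $(n_1, n_3, k_8) = (35, 7, 7)$ survives.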

The configuration described by Lemma 12 is presented in Fig. 2, where the
lines of length 8 are drawn and the points of $N_3$ and $N_1$ are marked.




Fig. 2. Lines of length 8 in the pseudogeometry tt. 



Lemma 13 Every point of $N_3$ belongs to 5 lines of length 5 and to no line of
length 7.

□ Let $x \in N_3$ and $L \in \mathcal{L}_x \setminus \mathcal{L}^8$. Then $L \setminus \{x\}$ is contained in the union of four
disjoint sets $K \setminus N_3$, $K \in \mathcal{L}^8 \setminus \mathcal{L}_x$, and each of these sets has one common point
with $L$. So $|L| = 5$. On the other hand, each of these sets contains 5 points, so
$|\mathcal{L}_x \cap \mathcal{L}^5| = 5$. □

Lemma 14 Every point of $N_1$ belongs to some line of length 7.

□ Let $x \in N_1$ and $L \in \mathcal{L}^8 \setminus \mathcal{L}_x$. If $x$ belonged to no line of length 7, then
$|M(x,L)| = 36$ and the
remaining set of 6 points could not be a disjoint union of sets of cardinality 4. □

Now we pass again to the “experimental” part of our arguments. Applying
Lemma 5 and Lemma 12 to $\pi$ we obtain the equation

    $42 \cdot 41 = 20 k_5 + 42 k_7 + 7 \cdot 56.$

It has three solutions in non-negative integers:

(a) $k_5 = 14$, $k_7 = 25$;

(b) $k_5 = 35$, $k_7 = 15$;

(c) $k_5 = 56$, $k_7 = 5$.
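
The list of solutions can be confirmed by direct enumeration. A minimal sketch, with illustrative variable names:

```python
# 42*41 = 20*k5 + 42*k7 + 7*56, i.e. 20*k5 + 42*k7 = 1330.
target = 42 * 41 - 7 * 56
solutions = sorted(
    ((target - 42 * k7) // 20, k7)
    for k7 in range(target // 42 + 1)
    if (target - 42 * k7) % 20 == 0
)
```

Dividing by 2 gives $10k_5 + 21k_7 = 665$, which forces $k_7 \equiv 5 \pmod{10}$; only $k_7 \in \{5, 15, 25\}$ keeps $k_5$ non-negative.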

The case (a) is impossible since by Lemma 13 $k_5 \ge 35$. In the case (b) it is
evident that every point $x \in N_1$ belongs to 1 line of length 8 and to 4 lines of
length 5, so it must belong to 3 lines of length 7, and that every line of length 7
must intersect every line of length 8 in some point of $N_1$. If the points of $L \setminus N_3$
are marked by numbers $0, \ldots, 4$ for every line $L$ of length 8 then every line of
length 7 may be presented as a vector $(a_0, \ldots, a_6)$, where $a_i \in \overline{0,4}$. In other words
the lines of length 7 must form a code of length $n = 7$ over an alphabet of $q = 5$
elements having cardinality $c = 15$ and distance $d = 6$. Such codes seem to be
of independent interest because their parameters lie on the Plotkin bound:

    $d \le \frac{q-1}{q} \cdot \frac{c}{c-1} \cdot n$

(see, e.g., [9, Theorem 1.1.39]); indeed, $\frac{4}{5} \cdot \frac{15}{14} \cdot 7 = 6$. So we first asked if such codes exist. The answer
was affirmative:

Proposition 15 There exist codes of length n = 7 over an alphabet of q = 5 
elements having cardinality c = 15 and distance d = 6. 

□ The codes in question were constructed by a computer program. □

But the attempt to add the lines of length 5 failed: exhaustive search showed that it is
impossible.

In the case (c) it is easy to deduce from Lemma 14 that the lines of length 7 do
not intersect each other, and that every point in $N_1$ belongs to 4 lines of length
5 that connect it with points of $N_3$ and to one line of length 5 that is contained
in $N_1$. Again these restrictions make the exhaustive search possible and again
it showed that the desired configuration does not exist. We can summarize the 
results on the pseudogeometries with 42 points as follows. 

Theorem 16 Any nontrivial pseudogeometry with 42 points must contain a line 
of length 2,3,4 or 6. 
Higher Order Differential Attack 
Abstract. The encryption algorithm MISTY is “provably secure”
against linear and differential cryptanalysis. Since the designer showed that
3 round MISTY1 without the FL function is provably secure, we omit FL to
estimate its strength against Higher Order Differential Attack. This attack is
a chosen plain text attack and uses the value of a higher order differential
of the output to derive an attacking equation for sub-keys. The value depends
on the degree of the output, and the degree depends on the choice of plain
texts. We show an effective chosen plain text and that 5 round MISTY1
without FL is attackable using 11 different 7th order differentials. We also
show attacks on the remaining sub-keys using the determined sub-keys
and intermediate values. An intermediate value is a constant in the process
of encryption. As a result, we can determine all sub-keys in 5 round
MISTY1 without FL.

1 Introduction 

Linear and differential cryptanalysis are powerful attacks on DES-like
cryptosystems. As a countermeasure, the concept of being “provably secure” against them
was proposed [5][7]. The encryption algorithm MISTY, proposed by Matsui in
1996, is a block cipher designed under this concept [1]. There are two types of
the algorithm, MISTY1 and MISTY2. MISTY has an F-function named FO. MISTY1
has a DES-like structure with 8 rounds of the FO function. The designer claimed that 3
round MISTY1 without the FL function is enough to have provable security against
linear and differential cryptanalysis.

Since “provable security” is guaranteed only by FO, we analyzed a modified
MISTY1, which has no FL and a reduced number of rounds, by Higher Order
Differential Attack. This is a chosen plain text attack which uses the fact that
the value of a higher order differential of the output does not depend on sub-keys.
The order for the attack depends on the chosen plain text and it affects the
number of plain texts and the computational cost.

We show the outline of the attack in Sections 2 and 3. In Section 4, we show
the effective chosen plain text which enables the attack on 5 round MISTY1
without FL. The attack consists of 3 main parts. Section 4 shows an attack
using 7th order differentials to determine 4 sub-keys in the 5th round. Section 5
shows the estimation of intermediate values. An intermediate value is a constant
in the process of encryption, which is a function of the fixed part of the plain text and
sub-keys. Section 6 shows an attack using differentials of intermediate values to
determine the remaining sub-keys.

Fig. 1. The modified MISTY: (i) MISTY without FL, (ii) equivalent FO, (iii) equivalent FI
(where K and $K_{ijk}$ denote the equivalent sub-keys). C: cipher text, 64 [bit].



2 Modified MISTY1

In the following, we discuss one type of the algorithm, MISTY1. The success
of a linear or differential attack depends on the maximum linear or
differential probability. Let $p$ be the average probability of them for the F function.
From the theorem shown by Nyberg and Knudsen [5], the probability for 3 
round F functions equals to If the probability is low enough, they call 
such property as ’’provable secure” against Linear and Differential Attack. 

The designer showed that F function named FO has the probability with 
p < 2“®® and 3 round FO function is provable secure. Though the main part of 
security is guaranteed by the FO function, the designer added the auxiliary
function FL, expecting higher security. The FO function consists of 3 rounds of FI
functions. The FI function consists of two kinds of S-boxes called S7 and S9. The
degree of S7 is 3 and the degree of S9 is 2. We denote the FO function in the $i$-th round
as $FO_i$, and the $j$-th FI function in $FO_i$ as $FI_{ij}$.

We attack the modified MISTY1 by Higher Order Differential Attack. The
modification omits the FL functions and reduces the number of rounds,
motivated by the statement of the designer that MISTY1 without FL is enough to
attain provable security with 3 or more rounds. To simplify
the attacking equation, we deduce the equivalent FO function and FI function
(Figure 1). We denote equivalent sub-keys as K and $K_{ijk}$. In the following, we
use input and output variables for $FI_{ij}$ and the $k$-th S-box as shown in Figure 2(i)
and (ii).

Fig. 2. (i) $FI_{ij}$, (ii) the $k$-th S-box in $FI_{ij}$: the input and output variables for $FI_{ij}$
and the $k$-th S-box; (iii): the last round of $i$ round MISTY.

3 Higher Order Differential Attack 

3.1 Higher Order Differential 

Let F{X;K) be a function : GF(2)” x GF(2)^ GF(2)'". 

Y = F{X;K), (X e GF(2)”,y e GF(2)™,iF e GF(2)®) (1) 

Let $(A_1, A_2, \ldots, A_i)$ be a set of linearly independent vectors in $GF(2)^n$ and
$V^{(i)}$ be the subspace spanned by the set. We define $\Delta^{(i)} F(X;K)$ as the $i$-th order
differential of $F(X;K)$ with respect to $X$ as follows.

    $\Delta^{(i)} F(X;K) = \sum_{A \in V^{(i)}} F(X + A; K)$    (2)

If $\deg_X F(X;K) = N$, we have the following properties. Let the symbol “$]_d$” be the
operation which omits terms whose degree is smaller than $d$.

Property 1. 

ie^AFiX, A')} = A ^ ‘JmFiX: A)]„ <*> 



Property 2. Let $F(X) : GF(2)^n \to GF(2)^m$. If $V^{(n)} = GF(2)^n$, then for any fixed
value $f \in GF(2)^n$, $\Delta^{(n)} F(X + f; K) = \Delta^{(n)} F(X; K)$.
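
The definition of the higher order differential and Property 2 can be tried on a toy Boolean function. In the sketch below, vectors of $GF(2)^n$ are encoded as Python integers with XOR as addition; the functions `cubic` and `quadratic` are hypothetical examples without a key argument and have nothing to do with the MISTY components.

```python
def span(basis):
    """All GF(2)-linear combinations of the basis vectors (vectors are ints, addition is XOR)."""
    vectors = {0}
    for b in basis:
        vectors |= {v ^ b for v in vectors}
    return vectors


def differential(func, basis, x):
    """Higher order differential: XOR of func(x + a) over all a in the span of `basis`."""
    acc = 0
    for a in span(basis):
        acc ^= func(x ^ a)
    return acc


def bit(x, i):
    return (x >> i) & 1


def cubic(x):
    """Degree 3 in (x0, x1, x2): x0*x1*x2 + x0*x1 + x2."""
    return (bit(x, 0) & bit(x, 1) & bit(x, 2)) ^ (bit(x, 0) & bit(x, 1)) ^ bit(x, 2)


def quadratic(x):
    """Degree 2 in (x0, x1, x2): x0*x1 + x2."""
    return (bit(x, 0) & bit(x, 1)) ^ bit(x, 2)


basis = [0b001, 0b010, 0b100]  # spans all of GF(2)^3
d3_cubic = [differential(cubic, basis, x) for x in range(8)]
d3_quadratic = [differential(quadratic, basis, x) for x in range(8)]
```

The 3rd order differential of the degree-3 function equals the coefficient of $x_0 x_1 x_2$ at every point $X$, in line with Property 2, while the 3rd order differential of the degree-2 function vanishes.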
3.2 Attacking Equation 

Figure 2(iii) shows the last round of i round MISTYl. can be calculated 

by cipher text Cl(X), Cr{X)g GF(2)^^ and sub-key as follows. 

= FO{Cl{X); + Cr{X) (4) 

If degxH^''\X) = N, following equation holds. 

where F{-) denotes the function GF(2)^^ x GF(2)^®^(*“^) 

A(1'(*“2)) denotes the set of keys for previous {i — 2) rounds 

From equations (4) and (5), we can derive the following equation. 

{FO{Cl{X + A); A«) + Cr{X + A)} = A^^'>F{X; (6) 

Aev(^) 

If the right hand of this equation can be determined for some analytical method, 
we can use this equation (6) as the attacking equation for 



(5) 

GF(2)32. 



4 Attack of MISTY1

4.1 Effective Chosen Plain Text

The order for Higher Order Differential Attack depends on the chosen plain text.
Since the order affects the number of chosen plain texts and the computational
cost, it is important to search for the effective chosen plain text. The plain text
can be divided into 8 sub-blocks according to the S-boxes they are input to.

    $P = (X_7, X_6, \ldots, X_1, X_0)$, $X_i \in GF(2)^7$ for even $i$, $X_i \in GF(2)^9$ for odd $i$, $(i = 0, \ldots, 7)$    (7)

The degree of output depends on which sub-block we choose as a variable. We
searched for the effective choice which makes the slowest increase of degree. As
a result, the effective one is to keep all the sub-blocks fixed except the rightmost
7[bit] sub-block $X_0$. For this chosen plain text, the increase of degree by the
formal analysis is shown in Figure 3. The symbol $\langle i | j \rangle$ denotes that the
degree of the left block is $i$ and the degree of the right block is $j$.



4.2 Attacking Equation Using 7th Order Differential 

We have a 7[bit] variable for the chosen plain text P. Let us discuss the attack using
the 7th order differential. We use a subspace as follows.

    $V^{(7)} = (A_1, A_2, \ldots, A_7)$, $A_i = (0, 0, \ldots, 1, \ldots, 0) \in GF(2)^{64}$, with the 1 in the $(i-1)$-th bit    (8)
Fig. 3. The increase of degree by the formal analysis (To simplify the expression, 
we omit sub-keys in these figures.) 



Let H ^2 be the left 7[bit] of the output from FO3. 

-^32" = Hzi 2 + H 322 + Z 322 (9) 

From Property 1, the following holds. 

+ H322 + ^322)]t 

= Z\^’^^iL3i2]7 (10) 

Let IF(-) be the function GF(2)^ x GF(2)® GF(2)^ shown in Figure 3. 

H312 = J^{Xo + Hi 33 + K222,y22l) ( 11 ) 

Note that I221 is a constant for the chosen plain text P. As Xq spans GF(2)"^, 
from Property 2, the following holds. 

Z\(^)iJ3i2 = Z\(^).F(Ao + H 133 + K222,Y22i) 

= Z\(^).F(Ao,y22i) (12) 

From equation (10) and (12), we have 7th order differential of follows. 

= A^^^P{Xo, Y22 i)]7 (13) 

We calculated the Boolean expressions of $H_{312}$ by using the computer algebra
software REDUCE. As a result, we found the following.

Table 1. The Boolean expression of $H_{312}$.



ho 


a:oa;ia;2a;3a:4a:5*6 + (yo + ys + V5 + ye + ye)xoXiX2X3X4,xe H + 1 


hi 


[ye + t/2 + 2/4 + y7)xoXiX2X3XAXe ^ h 1 / 52/7 + 2/52/8 + 2/62/8 + 2/6 


h2 


xoXiX2XeXAXeXe + ( 2/0 + 2/2 + 2/4 + 2/s + 2/7 + 2/8 + l)a;o2:ia:2a:3a;4a:5 H + 1 


ho 


a:o*ia:22;3a:4X5*6 + ( 2/0 + 2/3 + 2/4 + 2/6 + ys)xoX\X2X3XAXe H + 1 


/14 


( 2/0 + 2/2 + 2/3 + 2/6 + y7)X0XlX2X3XAXe H h 2/62/72/8 + 2/7 + 2/8 + 1 


hs 


a:oa;ia; 2 a; 3 a: 4 a: 5*6 + ( 2/1 + 2/6 + 2 /s + l)a;oa;ia; 2 a: 3 a: 4 a /5 H + 2/8 


he 


* 0 * 1 * 22 : 33 : 42 : 5*6 + ( 2/0 + 2/2 + 2/5 + 2/7 + 1)*0*1*2*3*4*5 H + 2/6 + 2/7 



1. The degree of $H_{312}$ equals 7.

2. The value of the 7th order differential of $H_{312}$ equals 0x6D.

3. The coefficients of the terms of degree 6 are functions of the elements of $Y_{221}$.

We show a part of them in Table 1.

$$X_{222} = (x_6, \ldots, x_0), \quad (X_{222} = X_0 + H_{133} + K_{222})$$

$$Y_{221} = (y_8, \ldots, y_0), \quad H_{312} = (h_6, \ldots, h_0)$$

By using $\Delta^{(7)} H_{312}$ = 0x6D, the following attacking equation with respect to
$K_{522}$, $K_{521}$, $K_{512}$ and $K_{511}$ (32[bit] out of 75[bit]) can be derived.

$$\sum_{A \in V^{(7)}} \{ FO(C_L(P + A); K_{522}, K_{521}, K_{512}, K_{511}) + C_R(P + A) \} = \mathrm{0x6D} \qquad (14)$$

As we construct the attacking equation for 7 [bit] P[o 2 j the resultant equation is 
the vector equation on GF(2)’^. Note that the appropriate bit of Cl{P), Cr{P) 
and FO(-) is selected for the attacking equation (15). 

4.3 Number of Chosen Plain Texts and Computational Cost 

We adopted the algebraic method [8] for solving the attacking equation (14).
We regard all the variable terms with respect to $K_{522}$, $K_{521}$, $K_{512}$ and $K_{511}$
as independent variables. The attacking equation has two 9[bit] unknowns
($K_{521}$ and $K_{511}$) whose degree is 1, and two 7[bit] unknowns ($K_{522}$ and $K_{512}$)
whose degree is 2. The equation can be regarded as a linear equation which
has $2 \times (9 + 7 + {}_7C_2) = 74$ unknowns. We can deduce 7 linear equations from
one 7th order differential. To solve the equation, we need $\lceil 74/7 \rceil = 11$ different
7th order differentials. Thus we need $2^7 \times 11 = 1{,}408$ chosen plain texts.
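The counting argument above can be re-checked with a few lines of arithmetic. This is only a sketch of the bookkeeping; the function name `linearized_unknowns` is hypothetical, and it assumes that a degree-1 block of b bits contributes b unknowns while a degree-2 block of b bits contributes b + C(b, 2) unknowns after linearization.

```python
from math import ceil, comb

def linearized_unknowns(deg1_bits, deg2_bits):
    """Number of unknowns when every key-dependent term is treated as an independent variable."""
    linear = sum(deg1_bits)                                # one unknown per bit
    quadratic = sum(b + comb(b, 2) for b in deg2_bits)     # bits plus pairwise products
    return linear + quadratic

unknowns = linearized_unknowns([9, 9], [7, 7])   # two 9-bit and two 7-bit sub-keys
differentials = ceil(unknowns / 7)               # each differential yields 7 equations
chosen_plaintexts = differentials * 2**7         # each 7th order differential uses 2^7 texts

if __name__ == "__main__":
    print(unknowns, differentials, chosen_plaintexts)
```

The same helper reproduces the 44 and 37 unknowns quoted for the 6th order and 1st order differential equations later in the paper.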

To solve the attacking equation, we calculate the coefficient matrix. The size 
of matrix is 74 x 74, so it needs 74 x ^ x 2^ ~ 2^"^ times FO function operations. 
If we make a brute force search by using 2^ chosen plain texts, the computational 
cost is 2^ X 2^^ = 2^®. The algebraic method is 2^^ times faster than the brute 
force search (Table 2). The computer simulation took about 0.5[s] for this attack 
(SONY NEWS5000: CPU R4400 150[Mhz], Memory 32[M]). 
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Table 2. The compalison of the algebraic method and the blute force sarch. 





                      Number of texts   Computational cost   CPU time
Algebraic method      1,408                                  about 0.5[s]
Brute force search    256               2^35                 -



Table 3. The Boolean expression of Hs22- 



ho 


{hohi + hohi - 1 - hihs + hih4 H -1 h^hs + hike -|- + 1)xqX\X2X3X4X5 ■ ■ ■ 


hi 


{hohs + hohs -|- hohe + hih 2 + h 2 h 3 + ^ 2^4 -I- /i 2 hs -f h.3h4)xoXiX2X3X4X5 ■ ■ ■ 


h 2 


{hohi -\-ho-\- hih 2 + hi/is -|- h\h4 + / 11/15 -1 hi/ie H h h%)xoX4X2X3X4X5 ■ ■ ■ 


hs 


{hoh 2 + hoh 3 -I- hohs + hih4 H + hahe + h. 4/15 + >15 + he)xoXiX2X3X4X5 ■ ■ ■ 


h4 


{hohi + hoho - 1 - hoh 4 + hohs + hohe + ho + hihs H h he)xoXiX 2 X 3 X 4 Xs ■ ■ ■ 


hs 


{hohi + hoh 2 - 1 - hoho + hoh4 H + hohe + /i4hs -f h4ho + l)a;o2:ia:2a:3a:4a;5 • • • 


he 


(/ioh-2 + hoho + hohs + hoho + ho + / 11/12 + hihs H h l)a;o2;i®2a:3X4a:5 • • • 



5 Estimation of Intermediate Value 

5.1 Intermediate Value 

There are constants which are functions of the fixed part of the plain text and the
sub-keys. We call such constants intermediate values. Since we have the 11 different
7th order differentials mentioned before, we have 11 different constants in the fixed
part of the chosen plain texts. Thus each intermediate value takes 11 different values.
In the following, we use the differentials of these values to derive attacking equations
for the remaining sub-keys. Next, we estimate the intermediate values using 6th order
differentials.



5.2 Attacking Equation Using 6th Order Differential 

By decoding the cipher texts using the derived sub- keys, we can calculate ■ 
From Figure 3, the degree of H322 will be 6. Let’s consider 6th order differential 
of equation (9). The terms whose degree is greater than 5 should be counted. 

H 32 ]e = {H312 + H322 + -^322)]6 (15) 

We calculated the Boolean expressions of $H_{322}$ by REDUCE. As a result, we
found the following.

1. The degree of $H_{322}$ equals 6.

2. The coefficients of the terms of degree 6 are functions of $H_{213}$.
We show a part of them in Table 3.

X222 = {xq, . . . ,xo), H213 = {Hq, . . . ,ho), 7/322 = (he, ... ,ho) (16) 
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From Table 1 , 3 and equation ( 15 ), hY^{i = 0 ~ 6 ) in H^2= (^ 6 ^) ■ ■ • i 
is as follows. 



6 

= Cj ■ X0---XQ + '^{[Y22i]j + [H2i3]j)x\{j} 

j=o 



(17) 



where Cj € GF( 2 ) is a coefficient. And where denotes the product 
of xo, - ■ ■ ,xq except Xj, [l22i]j denotes the coefficient of in Table 
1 , and [H2i3]j in Table 4 . 

We can calculate 6 th order differential of as follows. 



A 



(6) 

.( 6 ) 






= Ci'X 



+ [ 4 ^ 22 l]i+ [77213] j, 



\{ 1 > 



w( 6 ) _ / A 

Am “ 



. Ao 



, Aj-i,A 



i+i) ■ 



■ >^7) 
( 18 ) 



By solving this, we can determine the intermediate values $Y_{221}$, $H_{213}$ and one
bit $x_i$ of $X_{222}$. Since $X_{222}$ is 7[bit], we need 7 different 6th order differentials
to determine all of its bits. One 7th order differential yields 7 different 6th order
differentials. Equation (18) has 7[bit] and 9[bit] unknowns whose degree is 1
and one 7[bit] unknown whose degree is 2. So the equation has $7 + 9 + 7 + {}_7C_2 = 44$
unknowns. Thus we can solve equation (18) by using one 7th order differential.



6 Attack to the Remaining Sub-keys 

We have determined the intermediate values $X_{222}$, $Y_{221}$ and $H_{213}$. The attack on the
remaining sub-keys consists of: (1) estimation of the intermediate values $X_{212}$ and $X_{213}$,
(2) attack of FO1, (3) attack of FI21, (4) attack of FI22. Since we have 11 different
7th order differentials, each fixed sub-block $X_i$ ($i \neq 0$) takes several different values.

6.1 Estimation of the Intermediate Values X212 and X213 

In FI21, the following holds.

$$H_{213} = Y_{213} + Y_{212} + Y_{211} + Z_{212} \qquad (19)$$

The value of $Z_{212}$ is a constant determined by the plain text sub-block $X_2$.

$$Z_{212} = X_2 + H_{122} \qquad (20)$$

Since $H_{122}$ is a constant, the value of the 1st order differential of $Z_{212}$ equals the
value of the 1st order differential of $X_2$. Since the value of $H_{213}$ is known, we can calculate
the value of the 1st order differential of (19) as follows.

$$\Delta H_{213} = \Delta Y_{213} + \Delta Y_{212} + \Delta Z_{212} \qquad (21)$$

Since $Y_{213} = S9(X_{213})$, we can regard $\Delta Y_{213}$ as a linear equation. In the same way,
we can regard $\Delta Y_{212}$ as a 2nd order equation. Equation (21) has a 7[bit] unknown whose degree
is 2 and a 9[bit] unknown whose degree is 1. So the equation has $7 + 9 + {}_7C_2 = 37$
unknowns. To solve this, we need $\lceil 37/7 \rceil = 6$ different 1st order differentials. As
a result, we can determine $X_{212}$ and $X_{213}$.
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6.2 Attack of FOi 

The intermediate value A213 is calculated as follows. 

A213 = 2'212 + A ^213 + S 9 {H^^ + H ^2 + -f^21l) (22) 

Since $Z_{212}$ and $K_{213}$ are constants and $X_{213}$ is known, we can calculate the 1st order
differential of this as follows.

AA213 = AS9{H^^ + i/iV + K211) (23) 

We can determine the sub-keys $K_{111}$, $K_{112}$, $K_{121}$, $K_{122}$ and $K_{211}$ by solving this
equation.

The intermediate value A222 is calculated as follows. 

A222 = nV + + A222 + Ao + A4 ( 24 ) 

Since A222, Xq and X4 are knowns, we have following. 

Z\A222 = + AY^i^^ + AY^^^^ + AXo + AX4, (25) 

By solving equation (25), we can determine the sub-keys $K_{113}$, $K_{123}$, $K_{131}$, $K_{132}$ and
$K_{133}$. As a result, we can determine all sub-keys in FO1. So we can calculate
the values of the inputs to FO2.

6.3 Attack of FI 21 

In FI21, $X_{213}$ can be calculated as follows.

$$X_{213} = Y_{211} + Z_{212} + K_{213} \qquad (26)$$

Since all sub-keys in FO1 and $K_{211}$ are determined, the values of $Z_{212}$ and $Y_{211}$
can be calculated. Thus we can determine the sub-key $K_{213}$ from equation (26).
The following equation holds.

$$X_{212} = K_{212} + Z_{212} \qquad (27)$$

Since the value of $Z_{212}$ is known, we can determine the sub-key $K_{212}$. Since $K_{211}$ is
determined in the attack of FO1, we can determine all sub-keys in FI21. In the same
way, we can attack all sub-keys in FI22. Due to the limitation of space, we
omit the details.

7 Conclusion 

We showed that 5-round MISTY1 without the FL function, which is secure against
linear and differential cryptanalysis, is attackable by the higher order differential
attack using effective chosen plain texts. Our attack consists of 2 attack
phases. The 1st phase is an attack using 7th order differentials to determine
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4 sub-keys in 5th round. This attack needs 1,408 chosen plain texts and 2^^ of 
computational cost. Our computer simulation for this phase took about 0.5[s]. 
The 2nd phase is an attack using intermediate values, to determine another 15 
sub-keys. The chosen plain texts for the 1st phase are sufficient for this phase. 
After the sub-keys mentioned above are determined, it is far easier to estimate the
remaining sub-keys. We conclude that at least 6 rounds are necessary for resistance
against the higher order differential attack.
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Abstract. We introduce a new class of quantum error-correcting codes 
derived from (classical) Reed-Solomon codes over finite fields of charac- 
teristic two. Quantum circuits for encoding and decoding based on the 
discrete cyclic Fourier transform over finite fields are presented. 



1 Introduction 

During the last years it has been shown that computers taking advantage of
quantum mechanical phenomena outperform currently used computers. The
striking examples are integer factoring in polynomial time (see [18]) and finding
pre-images of an n-ary Boolean function (“searching”) in time $O(\sqrt{2^n})$ (see [12]).

Quantum computers are not only of theoretical nature — there are several sug- 
gestions how to physically realize them (see, e. g., [6,7]). 

On the way towards building a quantum computer, one very important prob- 
lem is to stabilize quantum mechanical systems since they are very vulnera- 
ble. A theory of quantum error-correcting codes has already been established 
(see [15]). Nevertheless, the problem of how to encode and decode quantum 
error-correcting codes has hardly been addressed, yet. 

In this paper, we present the construction of quantum error-correcting codes 
based on classical Reed-Solomon (RS) codes. For RS codes, many classical de- 
coding techniques exist. RS codes can also be used in the context of erasures and 
for concatenated codes. Encoding and decoding of quantum RS codes is based on 
quantum circuits for the cyclic discrete Fourier transform over finite fields which 
are presented in Section 4, together with the quantum implementation of any 
linear transformation over finite fields. We start with some results about binary 
codes obtained from codes over extension fields, followed by a brief introduction 
to quantum computation and quantum error-correcting codes. 



2 Binary Codes from Codes over $\mathbb{F}_{2^k}$

2.1 Bases of Finite Fields 

First, we recall some facts about finite fields (see, e. g., [13]). 

Any finite field of characteristic p, i.e., a finite field $\mathbb{F}_q$ where $q = p^k$, is
a vector space of dimension k over $\mathbb{F}_p$. For a fixed basis $\mathcal{B}$ of $\mathbb{F}_q$ over $\mathbb{F}_p$, any
element of $\mathbb{F}_q$ can thus be represented by a row vector of length k over $\mathbb{F}_p$. To
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stress the dependence on the choice of the basis B, we will denote this Fp vector 
space homomorphism by 

$$\mathcal{B} : \mathbb{F}_q \to \mathbb{F}_p^k, \quad a \mapsto \mathcal{B}(a). \qquad (1)$$

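The multiplication with a fixed element $a \in \mathbb{F}_q$ defines an $\mathbb{F}_p$-linear mapping.
Thus it can be written as a $k \times k$ matrix $M_\mathcal{B}(a)$ over $\mathbb{F}_p$ where $\mathcal{B}(a' \cdot a) =
\mathcal{B}(a') \cdot M_\mathcal{B}(a)$. The trace of $M_\mathcal{B}(a)$ is independent of the choice of the basis and
defines an $\mathbb{F}_p$-linear mapping

$$\mathrm{tr} : \mathbb{F}_q \to \mathbb{F}_p, \quad a \mapsto \mathrm{tr}(a) := \sum_{i=0}^{k-1} a^{p^i} = \mathrm{tr}(M_\mathcal{B}(a))$$

(for the last equality see, e.g., [9, Satz 1.24]).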

To be able to proceed further, we recall the definition of the dual basis. Given
a basis $\mathcal{B} = (b_1, \ldots, b_k)$ of a finite field $\mathbb{F}_q$ over $\mathbb{F}_p$, the dual basis of $\mathcal{B}$ is a basis
$\mathcal{B}^\perp = (b'_1, \ldots, b'_k)$ with

$$\forall i, j : \mathrm{tr}(b_i b'_j) = \delta_{ij}. \qquad (2)$$

For any basis there exists a unique dual basis (see [13, Theorem 4.1.1]). Furthermore,
for any finite field of characteristic two there exists a self-dual basis, i.e.,
a basis $\mathcal{B}$ with $\mathcal{B}^\perp = \mathcal{B}$ (see [13, Theorem 4.3.5]). For a self-dual basis $\mathcal{B}$, the
matrix $M_\mathcal{B}(a)$ is symmetric. This follows from [9, Satz 1.22], where

$$M_{\mathcal{B}^\perp}(a) = M_\mathcal{B}(a)^t \qquad (3)$$

is shown.
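The self-duality condition of Eq. (2) is easy to test by brute force in a small field. The following sketch assumes the field with 8 elements represented modulo $x^3 + x + 1$, with elements encoded as 3-bit integers; the function names and the two candidate bases are illustrative choices, not taken from the text above.

```python
def gf8_mul(a, b):
    """Multiply two elements of GF(8) = GF(2)[x]/(x^3 + x + 1), encoded as 3-bit integers."""
    r = 0
    for _ in range(3):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:      # reduce modulo x^3 + x + 1
            a ^= 0b1011
    return r

def trace(a):
    """Trace from GF(8) to GF(2): a + a^2 + a^4 (result is 0 or 1)."""
    a2 = gf8_mul(a, a)
    a4 = gf8_mul(a2, a2)
    return a ^ a2 ^ a4

def is_self_dual(basis):
    """Check tr(b_i * b_j) = delta_ij for all pairs of basis elements."""
    return all(trace(gf8_mul(bi, bj)) == (1 if i == j else 0)
               for i, bi in enumerate(basis)
               for j, bj in enumerate(basis))

ALPHA3, ALPHA5, ALPHA6 = 0b011, 0b111, 0b101   # alpha^3, alpha^5, alpha^6 for alpha = x

if __name__ == "__main__":
    print(is_self_dual([ALPHA3, ALPHA5, ALPHA6]))   # normal basis generated by alpha^3
    print(is_self_dual([0b001, 0b010, 0b100]))      # polynomial basis (1, alpha, alpha^2)
```

Under these assumptions the normal basis passes the check while the polynomial basis does not, which illustrates that self-duality is a property of the particular basis and not of the field.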

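Finally, any linear transformation $A \in GL(n, \mathbb{F}_{p^k})$ can be written as a linear
transformation $\mathcal{B}(A) \in GL(nk, \mathbb{F}_p)$ by replacing each entry $a_{ij}$ of A by $M_\mathcal{B}(a_{ij})$.
Moreover, the diagram (4) is commutative, i.e., the change to the ground field
can be done after or before the linear transformation, a fact that will be essential
later.

$$\begin{array}{ccc} \mathbb{F}_{p^k}^{n} & \xrightarrow{\;A\;} & \mathbb{F}_{p^k}^{n} \\ \downarrow{\scriptstyle\mathcal{B}} & & \downarrow{\scriptstyle\mathcal{B}} \\ \mathbb{F}_{p}^{kn} & \xrightarrow{\;\mathcal{B}(A)\;} & \mathbb{F}_{p}^{kn} \end{array} \qquad (4)$$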


2.2 Subfield Expansion of Linear Codes 

In the following, we restrict ourselves to the case p = 2, but the results are valid 
for any characteristic p > 0. 

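Definition 1. Let C = [N, K, D] denote a linear code of length N, dimension
K, and minimum distance D over the field $\mathbb{F}_{2^k}$, and let $\mathcal{B} = (b_1, \ldots, b_k)$ be a
basis of $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$. Then the binary expansion of C with respect to the basis
$\mathcal{B}$, denoted by $\mathcal{B}(C)$, is the linear binary code $C_2 = [kN, kK, d \geq D]$ given by

$$C_2 = \mathcal{B}(C) := \Big\{ (c_{ij})_{i,j} \in \mathbb{F}_2^{kN} \;\Big|\; c = \Big( \sum_{j=1}^{k} c_{ij} b_j \Big)_i \in C \Big\}.$$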
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The following theorem relates the dual codes of a code and its binary expansion. 

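Theorem 1. Let C = [N, K] be a linear code over the field $\mathbb{F}_{2^k}$ and let $C^\perp$ be
its dual. Then the dual code of the binary expansion $\mathcal{B}(C)$ of C with respect to
the basis $\mathcal{B}$ is the binary expansion $\mathcal{B}^\perp(C^\perp)$ of the dual code $C^\perp$ with respect to
the dual basis $\mathcal{B}^\perp$, i.e., the following diagram is commutative:

$$\begin{array}{ccc} C & \longrightarrow & C^\perp \\ \downarrow{\scriptstyle\text{basis } \mathcal{B}} & & \downarrow{\scriptstyle\text{dual basis } \mathcal{B}^\perp} \\ \mathcal{B}(C) & \longrightarrow & \mathcal{B}^\perp(C^\perp) = \mathcal{B}(C)^\perp \end{array}$$

Proof. Let $c \in C$ and $d \in C^\perp$ be arbitrary elements of the code and its dual,
resp. Then

$$0 = \sum_{i=1}^{N} c_i d_i = \sum_{i=1}^{N} \Big( \sum_{j=1}^{k} c_{ij} b_j \Big) \Big( \sum_{l=1}^{k} d_{il} b'_l \Big) \qquad (5)$$

where $\mathcal{B} = (b_1, \ldots, b_k)$ is a basis of $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$ and $\mathcal{B}^\perp = (b'_1, \ldots, b'_k)$ is the
corresponding dual basis. Taking the trace in (5) and rewriting the summation
yields

$$0 = \sum_{i=1}^{N} \sum_{j=1}^{k} \sum_{l=1}^{k} c_{ij} d_{il} \, \mathrm{tr}(b_j b'_l) = \sum_{i=1}^{N} \sum_{j=1}^{k} c_{ij} d_{ij}$$

(the last equality follows from Eq. (2)). Hence the binary expansions of the
codewords c and d are orthogonal, which proves that the binary expansion of
$C^\perp$ is contained in $\mathcal{B}(C)^\perp$. The theorem follows from the observation that both
sets have $2^{k(N-K)}$ elements. □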

Corollary 1. Let C = [N, K] be a weakly self-dual linear code over the field
$\mathbb{F}_{2^k}$. Then the binary expansion $\mathcal{B}(C)$ of C with respect to a self-dual basis $\mathcal{B}$ is
weakly self-dual, too.

3 Quantum Error-Correcting Codes

3.1 Qubits and Quantum Circuits 

In this section, we give a brief introduction to quantum computation (for a more 
comprehensive introduction see, e.g., [2,21]). 

The basic unit of quantum information, a quantum bit (or qubit for short), is
represented by the normalized linear combination

$$|q\rangle = \alpha|0\rangle + \beta|1\rangle, \quad \text{where } \alpha, \beta \in \mathbb{C}, \; |\alpha|^2 + |\beta|^2 = 1. \qquad (6)$$

Here $|0\rangle$ and $|1\rangle$ are orthonormal basis states written in Dirac notation (see [8]).
The normalization condition in Eq. (6) stems from the fact that when extracting
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classical information from the quantum system by a measurement, the results 
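“0” and “1” occur with probability $|\alpha|^2$ and $|\beta|^2$, resp.

A quantum register of length n is obtained by combining n qubits modeled by
the n-fold tensor product $(\mathbb{C}^2)^{\otimes n}$. The canonical orthonormal basis of $(\mathbb{C}^2)^{\otimes n}$
is

$$B := \{ |b_1\rangle \otimes \ldots \otimes |b_n\rangle =: |b_1 \ldots b_n\rangle = |b\rangle \mid b_i \in \{0, 1\} \}.$$

Hence the state of an n qubit register is given by

$$|\psi\rangle = \sum_{b \in \{0,1\}^n} c_b |b\rangle, \quad \text{where } c_b \in \mathbb{C} \text{ and } \sum_{b \in \{0,1\}^n} |c_b|^2 = 1. \qquad (7)$$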



All operations of a quantum computer are linear. Furthermore, in order to
preserve the normalization condition in Eq. (7), the operations have to be unitary.
Basic operations are single qubit operations and two qubit operations. A
single qubit operation on the jth qubit is given by $U = I_{2^{j-1}} \otimes U_2 \otimes I_{2^{n-j}}$
where $U_2 \in U(2)$ is a $2 \times 2$ unitary matrix. Important examples for single qubit
operations are the Hadamard transform H and the Pauli matrices $\sigma_x$, $\sigma_y$, $\sigma_z$:

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad (8)$$

where $i^2 = -1$. The most important example for a two qubit gate is the so-called
controlled NOT gate (CNOT) since any unitary operation on a $2^n$-dimensional
space can be implemented using only single qubit operations and CNOT gates
(see [1]). The transformation matrix of the CNOT gate is given by:



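$$\mathrm{CNOT} := \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad \begin{array}{l} |00\rangle \mapsto |00\rangle \\ |01\rangle \mapsto |01\rangle \\ |10\rangle \mapsto |11\rangle \\ |11\rangle \mapsto |10\rangle \end{array} \qquad (9)$$

[Circuit symbol: the control qubit $|x_1\rangle$ (marked with a dot) passes through unchanged; the target qubit $|x_2\rangle$ (marked with $\oplus$) becomes $|x_1 \oplus x_2\rangle$.]

The CNOT gate corresponds to the classical XOR gate since $\mathrm{CNOT}|x_1\rangle|x_2\rangle =
|x_1\rangle|x_1 \oplus x_2\rangle$. (For the graphical notation on the right hand side see, e.g., [1].)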
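The correspondence between the CNOT matrix and the classical XOR can be checked directly on the four basis states. The sketch below assumes the usual ordering of basis states, in which $|x_1 x_2\rangle$ has index $2 x_1 + x_2$; the helper names are hypothetical.

```python
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

def image_of_basis_state(matrix, index):
    """Index of the basis state onto which a permutation matrix maps |index>."""
    column = [row[index] for row in matrix]   # image of the basis vector is a column
    return column.index(1)

def cnot_classical(x1, x2):
    """Classical description: the control stays, the target becomes x1 XOR x2."""
    return x1, x1 ^ x2

if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            y = image_of_basis_state(CNOT, 2 * x1 + x2)
            print(f"|{x1}{x2}> -> |{y >> 1}{y & 1}>")
```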



3.2 Error Model 

In the following, we briefly summarize some results about quantum error-cor- 
recting codes. For a more comprehensive treatment, we refer to, e. g., [3,15]. 

One common assumption in the theory of quantum error-correcting codes
is that errors are local, i.e., only a small number of qubits are disturbed when
transmitting or storing the state of an n qubit register. The basic types of errors
are bit-flip errors exchanging the states $|0\rangle$ and $|1\rangle$, phase-flip errors changing
the relative phase of $|0\rangle$ and $|1\rangle$ by $\pi$, and their combination. The bit-flip error
corresponds to the Pauli matrix $\sigma_x$, the phase-flip error to $\sigma_z$, and their combination
to $\sigma_y$. It is sufficient to consider only this discrete set of errors in order
to cope with any possible local error (see [15]). The important duality of bit-flip
errors and phase-flip errors is shown by the following lemma.
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Lemma 1. Bit- flip and phase- flip errors are conjugated to each other by the 
Hadamard transform, i.e., $H \sigma_x H^{-1} = \sigma_z$ and $H \sigma_z H^{-1} = \sigma_x$.
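Lemma 1 can be verified numerically in a few lines. This sketch assumes the standard matrix forms of the Hadamard transform and of the Pauli matrices $\sigma_x$ and $\sigma_z$, and uses the fact that H is its own inverse; the helper names `matmul` and `close` are illustrative.

```python
from math import isclose, sqrt

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b):
    """Entry-wise comparison that tolerates floating point rounding."""
    return all(isclose(a[i][j], b[i][j], abs_tol=1e-12)
               for i in range(2) for j in range(2))

s = 1 / sqrt(2)
H = [[s, s], [s, -s]]          # Hadamard transform, H * H = identity
SIGMA_X = [[0, 1], [1, 0]]     # bit-flip
SIGMA_Z = [[1, 0], [0, -1]]    # phase-flip

if __name__ == "__main__":
    print(close(matmul(matmul(H, SIGMA_X), H), SIGMA_Z))
    print(close(matmul(matmul(H, SIGMA_Z), H), SIGMA_X))
```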

Errors operating on an n qubit system are represented by tensor products
of Pauli matrices and identity. The weight of an error $e = e_1 \otimes \ldots \otimes e_n$, where
$e_i \in \{id, \sigma_x, \sigma_y, \sigma_z\}$, is the number of local errors $e_i$ that differ from identity.



3.3 Quantum Codes 

Analogously to the notation C = [N, K, d] for a classical error-correcting code
encoding K information symbols using N code symbols and being capable of
detecting up to $d - 1$ errors, a quantum error-correcting code encoding K qubits
using N qubits is denoted by $\mathcal{C} = [[N, K, d]]$. The code $\mathcal{C}$ is a $2^K$-dimensional
subspace of the $2^N$-dimensional complex vector space $(\mathbb{C}^2)^{\otimes N}$ such that any
error of weight less than d can be detected or, equivalently, any error of weight
less than d/2 can be corrected.

The construction of quantum Reed-Solomon codes is based on the construc- 
tion of quantum error-correcting codes from weakly self-dual binary codes pre- 
sented in [5] and [19,20] as summarized by the following definition and theorem. 



Definition 2. Let C = [N, K] be a weakly self-dual linear binary code, i.e., $C \leq C^\perp$.
Furthermore, let $\{w_j \mid j = 1, \ldots, 2^{N-2K}\}$ be a system of representatives of
the cosets $C^\perp / C$. Then the basis states of a quantum code $\mathcal{C} = [[N, N - 2K]]$ are
given by

$$|\psi_j\rangle = \frac{1}{\sqrt{|C|}} \sum_{c \in C} |c + w_j\rangle. \qquad (10)$$



Theorem 2. Let d be the minimum distance of the dual code $C^\perp$ in Definition 2.
Then the corresponding quantum code is capable of detecting up to $d - 1$ errors
or, equivalently, is capable of correcting up to $(d - 1)/2$ errors.

Proof. (Sketch) A general state of the quantum code is a linear combination of
the states in Eq. (10), i.e.,

$$|\psi\rangle = \sum_j \alpha_j |\psi_j\rangle = \sum_j \frac{\alpha_j}{\sqrt{|C|}} \sum_{c \in C} |c + w_j\rangle = \sum_{c \in C^\perp} \beta_c |c\rangle. \qquad (11)$$

A combination of bit-flip and phase-flip errors can be written as

$$e = (\sigma_x^{e_{b,1}} \sigma_z^{e_{p,1}}) \otimes \ldots \otimes (\sigma_x^{e_{b,N}} \sigma_z^{e_{p,N}}), \qquad (12)$$

where $e_b$ and $e_p$ are binary vectors. The effect of this error on the state (11) is

$$e|\psi\rangle = \sum_{c \in C^\perp} \beta_c (-1)^{c \cdot e_p} |c + e_b\rangle. \qquad (13)$$

Computing the syndrome with respect to the binary code $C^\perp$ using auxiliary
qubits, we obtain the state

$$\sum_{c \in C^\perp} \beta_c (-1)^{c \cdot e_p} |c + e_b\rangle |s(c + e_b)\rangle. \qquad (14)$$
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As the syndrome s(c + et,) depends only on the error et, the state (14) is a 
tensor product and we can measure the syndrome without disturbing the first
part of the quantum register. Using a classical decoding algorithm for the code
$C^\perp$, the error vector $e_b$ is computed from the measured syndrome $s(e_b)$. For
each non-zero position of $e_b$, a $\sigma_x$ gate is applied to correct the error.

From Lemma 1 it follows that the Hadamard transform exchanges the roles of
$e_b$ and $e_p$ in Eq. (12). Furthermore, computing the Hadamard transform of the
states in (10) yields

$$\frac{1}{\sqrt{|C^\perp|}} \sum_{c \in C^\perp} (-1)^{c \cdot w_j} |c\rangle. \qquad (15)$$

Hence, the Hadamard transform changes the state (13) into

$$\sum_{c \in C^\perp} \gamma_c (-1)^{c \cdot e_b} |c + e_p\rangle.$$

The error vector $e_p$ can be determined as before. □

The general outline of decoding is shown in Fig. 1. 



[Figure 1: decoding circuit acting on the (erroneous) encoded state and the auxiliary qubits.]


Fig. 1. General decoding scheme for a quantum error-correcting code con- 
structed from a weakly self-dual binary code. 

Before we will be ready to present quantum Reed-Solomon codes, we need 
to show how to implement a discrete Fourier transform over a finite field on a 
quantum computer. 

4 Quantum Implementation of the Cyclic DFT over $\mathbb{F}_{2^k}$

Recall from Section 3.1 that the state of an n qubit system can be written as

$$|\psi\rangle = \sum_{x \in \mathbb{F}_2^n} c_x |x\rangle, \quad \text{where } c_x \in \mathbb{C} \text{ and } \sum_{x \in \mathbb{F}_2^n} |c_x|^2 = 1. \qquad (16)$$

Hence any invertible linear transformation $A \in GL(n, \mathbb{F}_2)$ on the binary vector
space $\mathbb{F}_2^n$ induces a linear transformation $Q(A) \in GL(2^n, \mathbb{C})$ on the complex
vector space $\mathbb{C}^{2^n} = (\mathbb{C}^2)^{\otimes n}$. The transformation Q(A) permutes the basis states
$|x\rangle$ according to $Q(A) : |x\rangle \mapsto |xA\rangle$. In the following we will show how this
transformation can be implemented efficiently using only CNOT gates.







Quantum Reed-Solomon Codes 237 



Theorem 3. Let $\pi \in S_n$ be a permutation and let $P \in GL(n, \mathbb{F}_2)$ be the corresponding
permutation matrix acting on the binary vector space $\mathbb{F}_2^n$. Then the
quantum transformation $Q(P) \in GL(2^n, \mathbb{C})$ defined by $Q(P) : |x\rangle \mapsto |xP\rangle$ is
a permutation matrix permuting the n tensor factors of the complex vector
space $\mathbb{C}^{2^n} = (\mathbb{C}^2)^{\otimes n}$. It can be implemented using at most $3(n - 1)$ CNOT
gates.

Proof. Any permutation $\pi \in S_n$ on n letters can be written as a product of at
most $n - 1$ transpositions. Each transposition (i, j) can be implemented by a
quantum circuit with three CNOT gates, see Fig. 2. □



[Figure 2: quantum circuit for the transposition (i, j) of two qubits.]



Fig. 2. Implementing a transposition of two qubits using three CNOT gates. 
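On classical basis states, the three-CNOT transposition is the familiar XOR swap. The sketch below acts on bit lists rather than on quantum states, and the particular order of controls and targets is an assumption (the standard one), since the figure itself is not reproduced here; the function names are hypothetical.

```python
def cnot(bits, control, target):
    """Apply a CNOT to a basis state given as a list of bits."""
    out = list(bits)
    out[target] ^= out[control]
    return out

def swap_via_cnots(bits, i, j):
    """Exchange positions i and j using three CNOT gates."""
    bits = cnot(bits, i, j)   # (a, b) -> (a, a + b)
    bits = cnot(bits, j, i)   # (a, a + b) -> (b, a + b)
    return cnot(bits, i, j)   # (b, a + b) -> (b, a)

if __name__ == "__main__":
    print(swap_via_cnots([1, 0, 0], 0, 2))
```

Because every basis state is mapped to its swapped counterpart, linearity extends the same behaviour to arbitrary superpositions.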



Theorem 4. Let $A \in GL(n, \mathbb{F}_2)$ be an invertible linear mapping on the binary
vector space $\mathbb{F}_2^n$. Then the quantum transformation $Q(A) \in GL(2^n, \mathbb{C})$ defined
by $Q(A) : |x\rangle \mapsto |xA\rangle$ is a permutation matrix acting on the complex vector
space $\mathbb{C}^{2^n}$. It can be implemented using at most $n(n - 1) + 3(n - 1)$ CNOT
gates.

Proof. Any matrix $A \in GL(n, \mathbb{F}_2)$ can be decomposed as $A = P \cdot L \cdot U$, where
P is a permutation matrix and L (resp. U) is a lower (upper) triangular matrix.
By Theorem 3 we need at most $3(n - 1)$ CNOT gates for the implementation
of Q(P).

For the implementation of the lower triangular matrix L, we use the factorization
$L = L_1 \cdot \ldots \cdot L_n$, where $L_i$ is almost an identity matrix, but the ith row
equals the ith row of L. Hence multiplication of a binary vector x with $L_i$ is
given by

$$x L_i = (x_1 + x_i L_{i1}, \ldots, x_{i-1} + x_i L_{i,i-1}, x_i, x_{i+1}, \ldots, x_n),$$

i.e., the jth position of x is inverted iff both $x_i$ and $L_{ij}$ are equal to one. This
translates into a sequence of at most $i - 1$ CNOT gates with control qubit i and
target qubit j whenever $L_{ij}$ equals one. In total, the implementation of Q(L)
needs at most $n(n - 1)/2$ CNOT gates. The quantum transformation Q(U) can
be implemented similarly. □
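The construction in the proof can be sketched for the lower triangular factor. The code below works on classical bit vectors (basis states) only, uses 0-based indices, and assumes L has ones on its diagonal; the function names are hypothetical and the example matrix is arbitrary.

```python
def cnot_sequence_lower(L):
    """List of (control, target) pairs realizing x -> x * L for unit lower triangular L over GF(2)."""
    n = len(L)
    # Row i of L contributes the factor L_i: control i, one target j < i per nonzero entry.
    return [(i, j) for i in range(n) for j in range(i) if L[i][j]]

def apply_cnots(x, gates):
    """Apply the CNOT gates in order to a basis state given as a list of bits."""
    x = list(x)
    for control, target in gates:
        x[target] ^= x[control]
    return x

def vec_mat(x, M):
    """Row vector times matrix over GF(2), used as the reference result."""
    n = len(x)
    return [sum(x[i] * M[i][j] for i in range(n)) % 2 for j in range(n)]

EXAMPLE_L = [[1, 0, 0, 0],
             [1, 1, 0, 0],
             [0, 1, 1, 0],
             [1, 1, 0, 1]]

if __name__ == "__main__":
    gates = cnot_sequence_lower(EXAMPLE_L)
    print(gates, apply_cnots([0, 1, 0, 1], gates))
```

The gate count equals the number of nonzero entries below the diagonal, so it never exceeds $n(n - 1)/2$.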

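The quantum implementation of linear mappings over an extension field $\mathbb{F}_{2^k}$
can be reduced to implementing linear mappings over $\mathbb{F}_2$. First, we fix a basis
$\mathcal{B}$ of $\mathbb{F}_{2^k}$. By extending the homomorphism $\mathcal{B}$ given in Eq. (1) we obtain a
homomorphism $\mathbb{F}_{2^k}^n \to \mathbb{F}_2^{kn}$. Vectors $v \in \mathbb{F}_{2^k}^n$ are mapped to binary vectors of
length kn represented by kn qubits. Similarly to Eq. (16), we get

$$|\psi\rangle = \sum_{v \in \mathbb{F}_{2^k}^n} c_v |v\rangle = \sum_{v \in \mathbb{F}_{2^k}^n} c_v |\mathcal{B}(v_1)\rangle \otimes \ldots \otimes |\mathcal{B}(v_n)\rangle. \qquad (17)$$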
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In this representation, a linear mapping A € GL(n, F 2 k) corresponds to a linear 
mapping $\mathcal{B}(A) \in GL(nk, \mathbb{F}_2)$ (see Eq. (4)).

In the context of quantum Reed-Solomon codes, we will use the cyclic discrete
Fourier transform over $\mathbb{F}_{2^k}$ which can be implemented efficiently as a quantum
circuit.

Theorem 5. Let n be a divisor of $2^k - 1$ and let $\alpha \in \mathbb{F}_{2^k}$ be an element of order
n. Then the cyclic DFT of length n over the field $\mathbb{F}_{2^k}$, given by the matrix

$$\mathrm{DFT} := (\alpha^{ij})_{i,j = 0, \ldots, n-1}, \qquad (18)$$

can be implemented on a quantum computer using $O(k^2 n^2)$ CNOT gates.

Proof. The condition $n \mid (2^k - 1)$ ensures that the field $\mathbb{F}_{2^k}$ contains a primitive
nth root of unity $\alpha$. Thus, we have $\mathrm{DFT} \in GL(n, \mathbb{F}_{2^k})$. Fixing a basis $\mathcal{B}$ of $\mathbb{F}_{2^k}$, we
obtain a linear transformation $\mathcal{B}(\mathrm{DFT}) \in GL(nk, \mathbb{F}_2)$ which can be implemented
using $O(k^2 n^2)$ CNOT gates by Theorem 4. □

5 Quantum Reed-Solomon Codes

5.1 Definition of Quantum Reed-Solomon Codes

First, we recall the definition of Reed-Solomon codes (see [16, Fig. 10.6]). 

Definition 3. A (classical) Reed-Solomon (RS) code of length $N = 2^k - 1$ over
the field $\mathbb{F}_{2^k}$ is a cyclic code with generator polynomial

$$g(X) = (X - \alpha^b)(X - \alpha^{b+1}) \cdots (X - \alpha^{b+\delta-2}),$$

where $\alpha$ is a primitive element of $\mathbb{F}_{2^k}$, i.e., an element of order $2^k - 1 = N$.
The dimension of the code is $K = N - \delta + 1$ and the minimum distance is $\delta$.

Alternatively, an RS code can be described by the spectrum with respect to the
cyclic discrete Fourier transform of length N over $\mathbb{F}_{2^k}$, see Eq. (18). For any
vector $v \in \mathbb{F}_{2^k}^N$, the spectrum is defined by

$$\hat{v} := v \cdot \mathrm{DFT} = (v(\alpha^i))_{i = 0, \ldots, N-1},$$

where $v(X) = \sum_{i=0}^{N-1} v_i X^i$. Then for any codeword c of an RS code, $\hat{c}$ has $\delta - 1$
consecutive (possibly cyclically wrapped around) zeros starting at position b.
Fixing the zeros in the spectrum, all codewords can be obtained by the inverse
Fourier transform, i.e., the set of codewords is given by

$$C = \{ \hat{v} \cdot \mathrm{DFT}^{-1} \mid \hat{v} \in \mathbb{F}_{2^k}^N, \; \hat{v}_b = \hat{v}_{b+1} = \ldots = \hat{v}_{b+\delta-2} = 0 \}, \qquad (19)$$

where the indices are computed modulo N.

Lemma 2. For b = 0 and $\delta > N/2 + 1$, RS codes are weakly self-dual.
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Proof. The generator polynomial oi C is g{X) = {X — \){X — a) . . . {X — ^). 

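The generator polynomial of the dual code $C^\perp$ is the reciprocal polynomial of
$(X^N - 1)/g(X)$, i.e.,

$$g^\perp(X) = (X - \alpha^{-(\delta-1)})(X - \alpha^{-\delta}) \cdots (X - \alpha^{-(N-1)}) = (X - \alpha^{N-\delta+1})(X - \alpha^{N-\delta}) \cdots (X - \alpha^{1}).$$

For $\delta > N/2 + 1$, $N - \delta + 1 \leq \delta - 2$. Thus $g^\perp(X)$ is a divisor of $g(X)$, which
proves $C \leq C^\perp$. □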
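The divisibility argument only involves the exponents of the roots, so it can be checked without any field arithmetic. In the following sketch (hypothetical function names, b = 0 assumed as in Lemma 2) the zeros of the code are the exponents 0 to $\delta - 2$ and the zeros of the dual are the negated exponents of the remaining roots of $X^N - 1$.

```python
def zeros_of_code(delta):
    """Exponents i with g(alpha^i) = 0 for b = 0, i.e. 0, 1, ..., delta - 2."""
    return set(range(delta - 1))

def zeros_of_dual(N, delta):
    """Exponents of the roots of the reciprocal polynomial of (X^N - 1) / g(X)."""
    return {(-i) % N for i in range(delta - 1, N)}

def is_weakly_self_dual(N, delta):
    # g_perp divides g exactly when every root of g_perp is also a root of g.
    return zeros_of_dual(N, delta) <= zeros_of_code(delta)

if __name__ == "__main__":
    print(is_weakly_self_dual(7, 5), is_weakly_self_dual(7, 4))
```

For the lengths tried in the test below, the containment holds exactly when $\delta > N/2 + 1$, in line with the lemma.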

The relation between the spectra of an RS code C and its dual is illustrated in
Fig. 3. The spectrum of any codeword $c \in C$ is zero at the first $\delta - 1$ positions,
whereas the spectrum of any codeword $c' \in C^\perp$ may take any value at the
corresponding positions, the first one and the last $\delta - 2$ positions. In contrast,
the last $N - \delta + 1$ positions of the spectrum of $c \in C$ are arbitrary, and positions
1 to $N - \delta + 1$ in the spectrum of $c' \in C^\perp$ are zero.

[Figure 3: two rows of spectrum positions; the spectrum $\hat{c}$ of $c \in C$ is zero at positions 0 to $\delta - 2$ and arbitrary (*) elsewhere; the spectrum $\hat{c}'$ of $c' \in C^\perp$ is zero at positions 1 to $N - \delta + 1$ and arbitrary (*) elsewhere.]

Fig. 3. Relation between the spectra of a Reed-Solomon code C and its dual.
Positions taking arbitrary values (marked with *) and positions being zero are
interchanged.

Combining Lemma 2 and Corollary 1, we are ready to define quantum Reed-Solomon
codes.

Definition 4. Let $C = [N, K, \delta]$ where $N = 2^k - 1$, $K = N - \delta + 1$, and
$\delta > N/2 + 1$ be a Reed-Solomon code over $\mathbb{F}_{2^k}$ (with b = 0). Furthermore, let
$\mathcal{B}$ be a self-dual basis of $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$. Then the quantum Reed-Solomon (QRS)
code is the quantum error-correcting code $\mathcal{C}$ of length kN derived from the weakly
self-dual binary code $\mathcal{B}(C)$ according to Definition 2.

The parameters of the QRS code are given by the following theorem.

Theorem 6. The QRS code $\mathcal{C}$ of Definition 4 encodes $k(N - 2K)$ qubits using
kN qubits. It is able to detect at least up to K errors, i.e., the parameters are
$\mathcal{C} = [[kN, k(N - 2K), d \geq K + 1]]$.

Proof. The weakly self-dual binary code $\mathcal{B}(C)$ has length kN and dimension kK.
Hence, by Definition 2 the corresponding quantum code encodes $kN - 2kK =
k(N - 2K)$ qubits. The dual code $\mathcal{B}(C^\perp)$ has dimension $k(\delta - 1)$ and minimum
distance $d \geq K + 1$. From Theorem 2 it follows that the QRS code can detect up
to $d - 1 \geq K$ errors. □
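The parameter formulas of Theorem 6 are simple enough to tabulate. The helper below is a hypothetical convenience function, not part of the paper; it returns the code length, the number of encoded qubits and the guaranteed lower bound on the minimum distance, and rejects values of $\delta$ outside the range required by Lemma 2.

```python
def qrs_parameters(k, delta):
    """Return (length, encoded qubits, lower bound on distance) of a QRS code."""
    N = 2**k - 1                 # length of the underlying RS code
    K = N - delta + 1            # dimension of the underlying RS code
    if not (N / 2 + 1 < delta <= N):
        raise ValueError("need N/2 + 1 < delta <= N for a weakly self-dual RS code")
    return k * N, k * (N - 2 * K), K + 1

if __name__ == "__main__":
    for k, delta in [(3, 5), (3, 6), (4, 9)]:
        print(k, delta, qrs_parameters(k, delta))
```

For k = 3 and $\delta = 5$ this gives a code on 21 qubits encoding 3 qubits with distance at least 4.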
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5.2 Encoding Qnantum Reed Solomon Codes 

Encoding of QRS codes is based on the quantum version of the cyclic discrete
Fourier transform over $\mathbb{F}_{2^k}$ presented in Section 4. In the sequel, let C be an RS
code over $\mathbb{F}_{2^k}$ and let $\mathcal{B}$ be a self-dual basis of $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$. Furthermore, we fix
a primitive element $\alpha \in \mathbb{F}_{2^k}$.

Theorem 7. Let $\mathcal{C} = [[kN, k(N - 2K), d > K]]$ where $N = 2^k - 1$, $K = N - \delta + 1$,
and $\delta > N/2 + 1$ be a quantum Reed-Solomon code constructed from the
Reed-Solomon code $C = [N, K, \delta]$ over $\mathbb{F}_{2^k}$. The transformation

E = Q(8(DFT-^)) • 

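operating on states of the form

$$\underbrace{|\phi_1\rangle \otimes \ldots \otimes |\phi_k\rangle}_{k} \otimes \underbrace{|0\rangle \otimes \ldots \otimes |0\rangle}_{kK} \otimes \underbrace{|\phi_{k+1}\rangle \otimes \ldots \otimes |\phi_{k(N-2K)}\rangle}_{k(N-2K-1)} \otimes \underbrace{|0\rangle \otimes \ldots \otimes |0\rangle}_{kK}$$

is an encoder for the QRS code. The corresponding quantum circuit is shown in
Fig. 4.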




[Figure 4: encoder circuit; the input register is split into k qubits, kK qubits, $k(N - 2K - 1)$ qubits, and kK qubits.]



Fig. 4. Encoder for a quantum Reed-Solomon code. 



Proof. Similarly to Eq. (10), any basis state of the QTZS code can be written as 



\^j) 






■Wj)), 



(20) 



where the coset representatives Wj G C-*- will be specified later. The first 5—1 
positions of the spectrum of c are zero. From Eq. (19), the other positions may 
take any value. Thus computing the Fourier transform of the state (20) yields 



Q(8(DFT))|V^,-) 



1 

~7W\ 









5-1 



..,iK) + B{wj)). 



Without loss of generality, the last K positions of £5} can be chosen to be zero. 
Hence applying the Hadamard transform to the last kK qubits yields 

^ . Q(^(DFT))|V',) = = \B{wj)). 
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Furthermore, positions i = oi Wj are zero, too, since Wj G C-*- (see 

Fig. 3). For any set of values for the remaining positions, we get a different coset 
of C in C-L. □ 



5.3 Decoding Quantum Reed Solomon Codes 

Decoding procedures for quantum Reed-Solomon codes follow the scheme of 
Fig. 1. The syndrome of a vector v G are positions i = of the 

spectrum $\hat{v}$ of v which is obtained by computing the DFT of v. This syndrome,
indicating bit-flip errors, is “copied” to kK auxiliary qubits using CNOT gates.
Computing the inverse Fourier transform $\mathrm{DFT}^{-1}$ returns to the original basis.
After a Hadamard transform, the same circuit is used to compute the syndrome
of the phase-flip errors. The whole quantum circuit is shown in Fig. 5. Both the
syndrome of bit-flip errors and the syndrome of phase-flip errors are measured
yielding classical syndrome vectors. Then the most likely positions of errors are
computed using a classical algorithm, e.g., the Berlekamp-Massey algorithm or
the Euclidean algorithm (see [16]).

[Figure 5: the (erroneous) encoded state enters as k qubits, kK qubits, and $k(N - K - 1)$ qubits; two auxiliary registers of kK qubits each receive the syndrome of bit-flip errors and the syndrome of phase-flip errors.]

Fig. 5. Computation of the syndrome for a quantum Reed-Solomon code.

The quantum circuit in Fig. 5 can be simplified using the following theorem. 



Theorem 8. Let DFT denote the cyclic discrete Fourier transform of length n
over the field $\mathbb{F}_{2^k}$ and let $\mathcal{B}$ be a self-dual basis of $\mathbb{F}_{2^k}$ over $\mathbb{F}_2$. Then the following
identities hold:

Q(S(DFT-h) • • Q(S(DFT)) = (21) 

= Q{B{tt)) • 

where tt is the permutation x —x mod n. Using the factorization on the right 
hand side of Eq. (21) instead of the factorization on the left hand side for the 
implementation as a quantum circuit reduces the complexity from 0{k‘^n^) to 
0{kn). 
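Proof. Let $D := \mathcal{B}(\mathrm{DFT})$ denote the binary matrix obtained by replacing each
entry of DFT by its binary matrix representation with respect to B. For a self-dual
basis, this representation is symmetric (see Eq. (3)), and the Fourier matrix is
symmetric, too. Hence the matrix D is also symmetric. Using Dirac notation, the
matrices read

$$Q(D) = \sum_{x \in \mathbb{F}_2^{kn}} |xD\rangle\langle x| = \sum_{x \in \mathbb{F}_2^{kn}} |x\rangle\langle xD^{-1}| \quad\text{and}\quad H^{\otimes kn} = \frac{1}{\sqrt{2^{kn}}} \sum_{x,y \in \mathbb{F}_2^{kn}} (-1)^{x \cdot y} |x\rangle\langle y|.$$

Multiplying the matrices results in

$$Q(D^{-1}) \cdot H^{\otimes kn} \cdot Q(D) = \frac{1}{\sqrt{2^{kn}}} \sum_{u,v \in \mathbb{F}_2^{kn}} \sum_{x,y \in \mathbb{F}_2^{kn}} (-1)^{x \cdot y}\, |uD^{-1}\rangle \underbrace{\langle u|x\rangle}_{=\delta_{u,x}} \underbrace{\langle y|v\rangle}_{=\delta_{y,v}} \langle vD^{-1}|$$

$$= \frac{1}{\sqrt{2^{kn}}} \sum_{x,y \in \mathbb{F}_2^{kn}} (-1)^{x \cdot y}\, |xD^{-1}\rangle\langle yD^{-1}| = \frac{1}{\sqrt{2^{kn}}} \sum_{x,y \in \mathbb{F}_2^{kn}} (-1)^{(xD) \cdot (yD)}\, |x\rangle\langle y|.$$

The inner product of $xD$ and $yD$ is the same as the inner product of $xDD^t$ and
$y$. Since D is symmetric, $DD^t = D^2 = \mathcal{B}(\mathrm{DFT}^2) = \mathcal{B}(\pi)$. Finally, we obtain

$$Q(D^{-1}) \cdot H^{\otimes kn} \cdot Q(D) = \frac{1}{\sqrt{2^{kn}}} \sum_{x,y \in \mathbb{F}_2^{kn}} (-1)^{x \cdot y}\, |x\rangle\langle y\mathcal{B}(\pi)| = \frac{1}{\sqrt{2^{kn}}} \sum_{x,y \in \mathbb{F}_2^{kn}} (-1)^{x \cdot y}\, |x\rangle\langle y| \sum_{v \in \mathbb{F}_2^{kn}} |v\rangle\langle v\mathcal{B}(\pi)|$$

$$= \frac{1}{\sqrt{2^{kn}}} \sum_{x,y \in \mathbb{F}_2^{kn}} (-1)^{x \cdot y}\, |x\mathcal{B}(\pi)\rangle\langle y| = \sum_{v \in \mathbb{F}_2^{kn}} |v\mathcal{B}(\pi)\rangle\langle v| \; \frac{1}{\sqrt{2^{kn}}} \sum_{x,y \in \mathbb{F}_2^{kn}} (-1)^{x \cdot y}\, |x\rangle\langle y|.$$

□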



From Eq. (21) in the preceding theorem it follows that

$$Q(\mathcal{B}(\mathrm{DFT}^{-1})) \cdot H^{\otimes kn} = Q(\mathcal{B}(\pi)) \cdot H^{\otimes kn} \cdot Q(\mathcal{B}(\mathrm{DFT}^{-1})). \qquad (22)$$

Using the identities (21) and (22), and conjugating the CNOT gates by the
permutation of qubits $Q(\mathcal{B}(\pi))$, we obtain the simplified quantum circuit shown
in Fig. 6.



6 Example 

We construct a quantum Reed-Solomon code from an RS code over the field $\mathbb{F}_8$.
We choose $\delta = 5$ and obtain an RS code C = [7, 3, 5] with generator polynomial

$$g(X) = (X - \alpha^0)(X - \alpha^1)(X - \alpha^2)(X - \alpha^3),$$

where $\alpha$ is a primitive element of $\mathbb{F}_8$ fulfilling $\alpha^3 + \alpha + 1 = 0$. The dual code
$C^\perp$ = [7, 4, 4] is generated by

$$g^\perp(X) = (X - \alpha^{-4})(X - \alpha^{-5})(X - \alpha^{-6}) = (X - \alpha^3)(X - \alpha^2)(X - \alpha^1).$$

From Theorem 6, the resulting QRS code has parameters $\mathcal{C} = [[21, 3, d \ge 4]]$.

As self-dual basis of $\mathbb{F}_8$ we choose $B = (\alpha^3, \alpha^6, \alpha^5)$. The binary expansions
of $C$ and $C^\perp$ yield binary codes $C_2 = \mathcal{B}(C) = [21, 9, 8]$ and $C_2^\perp = \mathcal{B}(C^\perp) =
[21, 12, 5]$. Thus the QRS code has parameters $\mathcal{C} = [[21, 3, 5]]$.
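The choices made in this example can be checked with a few lines of arithmetic in $\mathbb{F}_8$. The following is only a sketch: the 3-bit integer representation of field elements, the helper names `gf8_mul`, `gf8_pow` and `trace`, and the order in which the basis elements are listed are assumptions made for illustration. It checks that the trace form on $(\alpha^3, \alpha^6, \alpha^5)$ is the identity matrix, i.e. that the basis is self-dual, and that the exponents $-4, -5, -6$ reduce to $3, 2, 1$ modulo 7.

```python
MODULUS = 0b1011  # x^3 + x + 1, i.e. alpha^3 + alpha + 1 = 0


def gf8_mul(a, b):
    """Multiply two elements of F_8 stored as 3-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:      # reduce as soon as the degree reaches 3
            a ^= MODULUS
    return result


def gf8_pow(a, exponent):
    """Raise a to a non-negative integer power by repeated multiplication."""
    result = 1
    for _ in range(exponent):
        result = gf8_mul(result, a)
    return result


def trace(a):
    """Trace from F_8 to F_2: a + a^2 + a^4, which is always 0 or 1."""
    return a ^ gf8_pow(a, 2) ^ gf8_pow(a, 4)


ALPHA = 0b010  # the primitive element alpha
basis = [gf8_pow(ALPHA, e) for e in (3, 6, 5)]

# Gram matrix of the trace form; the identity matrix means "self-dual".
gram = [[trace(gf8_mul(x, y)) for y in basis] for x in basis]

# Exponents of the roots of the dual generator polynomial, reduced mod 7.
dual_roots = [(-e) % 7 for e in (4, 5, 6)]

print(gram, dual_roots)
```

Self-duality does not depend on the order of the three basis elements, so the check is unaffected by how the basis is listed.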
Fig. 6. Simplified quantum circuit for computing the syndrome for a QRS code.



7 Conclusion 

Most quantum error-correcting codes known so far are based on classical binary 
codes or codes over $GF(4) = \mathbb{F}_{2^2}$ (see [4]). In this paper, we have demonstrated
how codes over extension fields of higher degree can be used. They might prove 
useful, e. g., for concatenated coding. 

The spectral techniques for encoding and decoding presented in this paper 
do not only apply to Reed-Solomon codes, but in general to all cyclic codes. The 
main advantage of Reed-Solomon codes is that no field extension is necessary. 
The same is true for all BCH codes of length n over the field $\mathbb{F}_{2^k}$ where $n \mid 2^k - 1$.
In addition to the spectral techniques, cyclic codes provide a great variety of 
encoding/decoding principles, e. g., based on linear shift registers that can be 
translated into quantum algorithms (see [10]). 

The quantum implementation of linear mappings over finite fields presented 
in Section 4 enlarges the set of efficient quantum subroutines. In contrast, the 
transforms used in most quantum algorithms — such as cyclic and generalized 
Fourier transforms — are defined over the complex field (see, e. g., [17]). 

It has to be investigated how efficient fully quantum algorithms for error- 
correction can be obtained, e. g., using quantum versions of the Berlekamp- 
Massey algorithm or of the Euclidean algorithm. 
Abstract. The capacity $C^{(3)}_{0,1}$ of a 3-dimensional (0, 1) runlength constrained
channel is shown to satisfy $0.522501741838 \le C^{(3)}_{0,1} \le 0.526880847825$.



1 Introduction 



A binary sequence satisfies a 1-dimensional (d, k) runlength constraint if there are at
most k zeros in a row, and between every two consecutive ones there are at least d
zeros. An n-dimensional binary array is said to satisfy a (d, k) runlength constraint, if
it satisfies the 1-dimensional (d, k) runlength constraint along every direction parallel
to a coordinate axis. Such an array is called valid. The number of valid n-dimensional
arrays of size $m_1 \times m_2 \times \dots \times m_n$ is denoted by $N^{(d,k)}_{m_1,m_2,\dots,m_n}$ and the corresponding
capacity is defined as

$$C^{(n)}_{d,k} = \lim_{m_1,m_2,\dots,m_n \to \infty} \frac{\log_2 N^{(d,k)}_{m_1,m_2,\dots,m_n}}{m_1 m_2 \cdots m_n}.$$

By exchanging the roles of 0 and 1 it can be seen that $C^{(n)}_{0,1} = C^{(n)}_{1,\infty}$ for all $n \ge 1$.
A simple proof of the existence of the 2-dimensional (d, k) capacities can be found in
[1], and the proof can be generalized to n dimensions.

It is known (e.g. see [2]) that the 1-dimensional (0, 1)-constrained capacity is the
logarithm of the golden ratio, i.e.

$$C^{(1)}_{0,1} = \log_2\left(\frac{1+\sqrt{5}}{2}\right) = 0.694242\ldots$$

and in [3] very close upper and lower bounds were given for the 2-dimensional (0, 1)-
constrained capacity. The bounds in [3] were calculated with greater precision in [4] and
are further slightly improved here by us (see Remark section at end for more details),
now agreeing in 9 decimal positions:

$$0.587891161775 \le C^{(2)}_{0,1} \le 0.587891161868. \qquad (1)$$

A lower bound of $C^{(2)}_{0,1} \ge 0.5831$ was obtained in [5] by using an implementable encoding
procedure known as “bit-stuffing”. The known bounds on $C^{(2)}_{0,1}$ have played a useful
role in [1] for obtaining bounds on other (d, k)-constraints in two dimensions. The
3-dimensional (0, 1)-constrained bounds given in the present paper can play a similar
role for obtaining different 3-dimensional bounds, and are also of theoretical interest. In
fact, a recent tutorial paper [6] discusses an interesting connection between runlength
constrained capacities in more than one dimension and crossword puzzles (based on
work of Shannon from 1948). In the present paper we consider the 3-dimensional (0, 1)
constraint, and by extending ideas from [3] our main result is to derive (in Sections 2
and 3) the following bounds on the 3-dimensional (0, 1) capacity.

Theorem 1

$$0.522501741838 \le C^{(3)}_{0,1} \le 0.526880847825$$



It is assumed henceforth in this paper that d = 0 and k = 1. Two valid $m_1 \times m_2$
rectangles can be put next to each other in 3 dimensions without violating the 3-
dimensional (0, 1) constraint if they have no two zeros in the same positions. Define a
transfer matrix $T_{m_1,m_2}$ to be an $N^{(0,1)}_{m_1,m_2} \times N^{(0,1)}_{m_1,m_2}$ binary matrix, such that the rows
and columns are indexed by the valid 2-dimensional $m_1 \times m_2$ patterns, and an entry of
$T_{m_1,m_2}$ is 1 if and only if the corresponding two rectangles can be placed next to each
other in 3 dimensions without violating the (0, 1) constraint. Then,

$$N^{(0,1)}_{m_1,m_2,m_3} = \mathbf{1}' \cdot T^{m_3-1}_{m_1,m_2} \cdot \mathbf{1} = \mathbf{1}' \cdot T^{m_2-1}_{m_1,m_3} \cdot \mathbf{1} = \mathbf{1}' \cdot T^{m_1-1}_{m_2,m_3} \cdot \mathbf{1},$$

where $\mathbf{1}$ is the all ones column vector and prime denotes transpose. The matrix $T_{m_1,m_2}$
meets the conditions of the Perron-Frobenius theorem [7], since it has nonnegative real
elements and is irreducible (since the all ones rectangle can be placed next to any
valid rectangle without violating the (0, 1) constraint). Therefore the largest magnitude
eigenvalue $\Lambda_{m_1,m_2}$ of $T_{m_1,m_2}$ is positive, real, and has multiplicity one. This implies
that

$$\lim_{m_3 \to \infty} \left(N^{(0,1)}_{m_1,m_2,m_3}\right)^{1/m_3} = \Lambda_{m_1,m_2}$$

and

$$C^{(3)}_{0,1} = \lim_{m_1,m_2,m_3 \to \infty} \frac{\log_2 N^{(0,1)}_{m_1,m_2,m_3}}{m_1 m_2 m_3} = \lim_{m_1,m_2 \to \infty} \frac{\log_2 \Lambda_{m_1,m_2}}{m_1 m_2} = \lim_{m_1 \to \infty} \frac{\log_2 \lim_{m_2 \to \infty} \Lambda^{1/m_2}_{m_1,m_2}}{m_1} = \lim_{m_1 \to \infty} \frac{\log_2 \Lambda_{m_1}}{m_1}, \qquad (2)$$

where $\Lambda_{m_1} = \lim_{m_2 \to \infty} \Lambda^{1/m_2}_{m_1,m_2}$. The quantities $\frac{\log_2 \Lambda_{m_1,m_2}}{m_1 m_2}$ and $\frac{\log_2 \Lambda_{m_1}}{m_1}$ can be
viewed as capacities corresponding to 3-dimensional arrays with two fixed sides (lengths
$m_1$ and $m_2$), and one fixed side (length $m_1$), respectively.
Upper and lower bounds on the 3-dimensional capacity can be computed directly
from the inequalities (similar to the 2-dimensional case, as noted in [4])

$$\frac{\log_2 \Lambda_{m_1,m_2}}{(m_1+1)(m_2+1)} \le C^{(3)}_{0,1} \le \frac{\log_2 \Lambda_{m_1,m_2}}{m_1 m_2}$$

but these do not yield particularly tight bounds for values of $m_1$ and $m_2$ that result in
reasonable space and time complexities (e.g. Table 1 shows that the eigenvalues $\Lambda_{m_1,m_2}$
correspond to matrices with more than 40 million elements when roughly $m_1 m_2 \ge 20$).
The upper and lower capacity bounds derived in this paper agree to within ±0.002 and
were computed using less than 100 Mbytes of computer memory.



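2 Lower Bound on $C^{(3)}_{0,1}$

To derive a lower bound on $C^{(3)}_{0,1}$ we generalize a method of Calkin and Wilf [3]. Since
$T_{m_1,m_2}$ is a symmetric matrix, the Courant-Fischer Minimax Theorem [8, pg. 394]
implies that

$$\Lambda^p_{m_1,m_2} \ge \frac{x' \cdot T^p_{m_1,m_2} \cdot x}{x' \cdot x} \qquad (3)$$

for any nonzero vector x and any integer $p \ge 0$. Choosing $x = T^q_{m_1,m_2} \cdot \mathbf{1}$ for any integer
$q \ge 0$ gives

$$\Lambda^p_{m_1,m_2} \ge \frac{\mathbf{1}' \cdot T^{p+2q}_{m_1,m_2} \cdot \mathbf{1}}{\mathbf{1}' \cdot T^{2q}_{m_1,m_2} \cdot \mathbf{1}} = \frac{\mathbf{1}' \cdot T^{m_2-1}_{m_1,p+2q+1} \cdot \mathbf{1}}{\mathbf{1}' \cdot T^{m_2-1}_{m_1,2q+1} \cdot \mathbf{1}}.$$

Thus,

$$2^{p C^{(3)}_{0,1}} = \left(\lim_{m_1,m_2 \to \infty} \Lambda^{1/(m_1 m_2)}_{m_1,m_2}\right)^p = \lim_{m_1 \to \infty} \left(\lim_{m_2 \to \infty} \left(\Lambda^p_{m_1,m_2}\right)^{1/m_2}\right)^{1/m_1} \ge \lim_{m_1 \to \infty} \left(\lim_{m_2 \to \infty} \left(\frac{\mathbf{1}' \cdot T^{m_2-1}_{m_1,p+2q+1} \cdot \mathbf{1}}{\mathbf{1}' \cdot T^{m_2-1}_{m_1,2q+1} \cdot \mathbf{1}}\right)^{1/m_2}\right)^{1/m_1} \qquad (4)$$

$$= \lim_{m_1 \to \infty} \left(\frac{\Lambda_{m_1,p+2q+1}}{\Lambda_{m_1,2q+1}}\right)^{1/m_1} = \frac{\lim_{m_1 \to \infty} \Lambda^{1/m_1}_{m_1,p+2q+1}}{\lim_{m_1 \to \infty} \Lambda^{1/m_1}_{m_1,2q+1}} = \frac{\Lambda_{p+2q+1}}{\Lambda_{2q+1}} \qquad (5)$$

and therefore for any odd integer $r \ge 1$ and any integer $z > r$,

$$C^{(3)}_{0,1} \ge \frac{1}{z-r} \log_2\left(\frac{\Lambda_z}{\Lambda_r}\right). \qquad (6)$$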

This lower bound on $C^{(3)}_{0,1}$ is analogous to a 2-dimensional bound in [3], but $\Lambda_z$ and
$\Lambda_r$ are not eigenvalues associated with transfer matrices of 2-dimensional arrays here,
and cannot easily be computed as in the 2-dimensional case. Instead, we obtain a lower
bound on $\Lambda_z$ and an upper bound on $\Lambda_r$. From (4) and (5) a lower bound on $\Lambda_z$ is

$$\Lambda_z = \lim_{m_2 \to \infty} \Lambda^{1/m_2}_{z,m_2} \ge \lim_{m_2 \to \infty} \left(\frac{\mathbf{1}' \cdot T^{m_2-1}_{z,v} \cdot \mathbf{1}}{\mathbf{1}' \cdot T^{m_2-1}_{z,u} \cdot \mathbf{1}}\right)^{1/((v-u) m_2)} = \left(\frac{\Lambda_{z,v}}{\Lambda_{z,u}}\right)^{1/(v-u)}$$

where u is an arbitrary positive odd integer, $v > u$, and $\Lambda_{z,v}$ and $\Lambda_{z,u}$ are the largest
eigenvalues of the transfer matrices $T_{z,v}$ and $T_{z,u}$, respectively.

To find an upper bound on the quantity $\Lambda_r$ for a given r, we apply a modified version
of a method in [3]. We say that a binary matrix satisfies the (0, 1) cylindrical constraint
if it satisfies the usual 2-dimensional (0, 1) constraint after joining its leftmost column to
its rightmost column (i.e. the left and right columns can be put next to each other without
violating the (0, 1) constraint). A binary matrix satisfies the (0, 1) toroidal constraint
if it satisfies the usual 2-dimensional (0, 1) constraint after both joining its leftmost
column to its rightmost column, and its top row to its bottom row.

Proposition 1 Let s be a positive even integer and let $T_{m_1,m_2}$ be the transfer matrix
whose rows and columns are indexed by all (0, 1)-constrained $m_1 \times m_2$ rectangles. Let
$B_{m_1,s}$ denote the transfer matrix whose rows and columns are indexed by all cylindrically
(0, 1)-constrained $m_1 \times s$ rectangles. Then,

$$\mathrm{Trace}\left[T^s_{m_1,m_2}\right] = \mathbf{1}' \cdot B^{m_2-1}_{m_1,s} \cdot \mathbf{1}.$$

Fig. 1. Cylindrically (0, 1)-constrained $m_1 \times s$ rectangles used to build cylindric
$m_1 \times m_2 \times s$ arrays.
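For small sizes the identity in Proposition 1 can be checked by brute force. The sketch below is illustrative only: patterns are enumerated exhaustively and stored as bitmasks of their zero positions, and all function names are assumptions introduced here, not taken from the paper. Two patterns may be stacked exactly when their zero masks are disjoint.

```python
from itertools import product


def is_valid(bits, rows, cols, wrap_cols):
    """True if no two zeros are adjacent (optionally wrapping the columns)."""
    for r in range(rows):
        for c in range(cols):
            if bits[r * cols + c] == 1:
                continue
            if r + 1 < rows and bits[(r + 1) * cols + c] == 0:
                return False
            if c + 1 < cols:
                right = c + 1
            elif wrap_cols and cols > 1:
                right = 0          # cylindrical constraint: last column meets first
            else:
                continue
            if bits[r * cols + right] == 0:
                return False
    return True


def zero_masks(rows, cols, wrap_cols=False):
    """Zero-position bitmasks of all valid rows x cols patterns."""
    return [sum(1 << i for i, b in enumerate(bits) if b == 0)
            for bits in product((0, 1), repeat=rows * cols)
            if is_valid(bits, rows, cols, wrap_cols)]


def transfer_matrix(masks):
    """Entry is 1 iff the two patterns have no zero in the same position."""
    return [[1 if a & b == 0 else 0 for b in masks] for a in masks]


def mat_pow(m, e):
    """Integer matrix power by repeated multiplication (fine for small sizes)."""
    n = len(m)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        result = [[sum(result[i][k] * m[k][j] for k in range(n))
                   for j in range(n)] for i in range(n)]
    return result


def trace_side(m1, m2, s):
    """Trace of T_{m1,m2}^s."""
    p = mat_pow(transfer_matrix(zero_masks(m1, m2)), s)
    return sum(p[i][i] for i in range(len(p)))


def cylinder_side(m1, m2, s):
    """1' * B_{m1,s}^(m2-1) * 1 with cylindrically constrained m1 x s patterns."""
    p = mat_pow(transfer_matrix(zero_masks(m1, s, wrap_cols=True)), m2 - 1)
    return sum(map(sum, p))


for m1, m2, s in [(1, 1, 2), (1, 2, 2), (1, 3, 4), (2, 2, 4)]:
    print((m1, m2, s), trace_side(m1, m2, s), cylinder_side(m1, m2, s))
```

Both sides count the same cylindric $m_1 \times m_2 \times s$ arrays, once as closed walks of s slices and once as a linear stack of $m_2$ cylindrical slices, which is why the two numbers printed for each triple agree.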



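For every positive integer $m_1$ and $m_2$, and every even positive integer s, the matrix
$T^s_{m_1,m_2}$ has nonnegative eigenvalues and thus any one of its eigenvalues is upper
bounded by its trace. Hence,

$$\Lambda^s_{m_1,m_2} \le \mathrm{Trace}\left[T^s_{m_1,m_2}\right] = \mathbf{1}' \cdot B^{m_2-1}_{m_1,s} \cdot \mathbf{1} \qquad (7)$$

which gives the following upper bound on $\Lambda_r$:

$$\Lambda_r = \lim_{m_2 \to \infty} \Lambda^{1/m_2}_{r,m_2} \le \lim_{m_2 \to \infty} \left(\mathbf{1}' \cdot B^{m_2-1}_{r,s} \cdot \mathbf{1}\right)^{1/(s m_2)} = \xi^{1/s}_{r,s} \qquad (8)$$

where $\xi_{r,s}$ is the largest eigenvalue of $B_{r,s}$ (note that $B_{r,s}$ satisfies the Perron-Frobenius
theorem for the same reasons as for $T_{m_1,m_2}$ in Section 1).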

The lower bound on $C^{(3)}_{0,1}$ in (6) can now be written as

$$C^{(3)}_{0,1} \ge \frac{1}{z-r} \log_2\left(\frac{\left(\Lambda_{z,v}/\Lambda_{z,u}\right)^{1/(v-u)}}{\xi^{1/s}_{r,s}}\right), \qquad r \text{ and } u \text{ odd},\ s \text{ even},\ z > r \ge 1,\ v > u \ge 1,\ s \ge 2. \qquad (9)$$

To obtain the best possible lower bound, the right hand side of (9) should be maximized
over all acceptable choices of r, z, u, v, and s, subject to the numerical computability
of the quantities $\Lambda_{z,v}$, $\Lambda_{z,u}$, and $\xi_{r,s}$. Table 1 shows the largest eigenvalues of various
transfer matrices which were numerically computable. From this table, the best
parameters we could find for the lower bound in (9) on the capacity were r = 3, z = 4,
u = 5, v = 6, and s = 10, yielding

$$C^{(3)}_{0,1} \ge \frac{1}{4-3} \log_2\left(\frac{9346.35893701 / 2102.73425568}{(80481.0598379)^{1/10}}\right) \ge 0.522501741838.$$



3 Upper Bound on $C^{(3)}_{0,1}$

Proposition 2 Let $s_1$ and $s_2$ be positive even integers and let $B^*_{s_1,s_2}$ denote the transfer
matrix whose rows and columns are indexed by all toroidally (0, 1)-constrained $s_1 \times s_2$
rectangles. If $\zeta_{s_1,s_2}$ is the largest eigenvalue of $B^*_{s_1,s_2}$, then $C^{(3)}_{0,1} \le \frac{1}{s_1 s_2} \log_2 \zeta_{s_1,s_2}$.

Note that $B^*_{2,s_2} = B_{2,s_2}$ and thus $\zeta_{2,s_2} = \xi_{2,s_2}$. The best parameters we were able
to find (from Table 1) were $s_1 = 4$ and $s_2 = 6$, and the resulting eigenvalue gave the
following upper bound:

$$C^{(3)}_{0,1} \le \frac{1}{24} \log_2 6405.69924332 \le 0.526880847825.$$
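Both numerical bounds can be reproduced from the quoted eigenvalues with ordinary floating-point arithmetic. The sketch below assumes only the formulas (9) and Proposition 2 as stated above; the function names are chosen here for illustration, and double precision does not give the guaranteed rounding direction that a rigorous bound would need.

```python
import math


def lower_bound(lam_zv, lam_zu, xi_rs, z, r, v, u, s):
    """Right-hand side of (9) for given eigenvalues and parameters."""
    lam_z_lower = (lam_zv / lam_zu) ** (1.0 / (v - u))  # lower bound on Lambda_z
    lam_r_upper = xi_rs ** (1.0 / s)                    # upper bound on Lambda_r
    return math.log2(lam_z_lower / lam_r_upper) / (z - r)


def upper_bound(zeta, s1, s2):
    """Proposition 2: log2(zeta_{s1,s2}) / (s1 * s2)."""
    return math.log2(zeta) / (s1 * s2)


# Eigenvalues as quoted in the text (r = 3, z = 4, u = 5, v = 6, s = 10).
low = lower_bound(9346.35893701, 2102.73425568, 80481.0598379,
                  z=4, r=3, v=6, u=5, s=10)
# Toroidal eigenvalue for s1 = 4, s2 = 6.
high = upper_bound(6405.69924332, 4, 6)

print(f"{low:.9f} <= C <= {high:.9f}")
```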



4 Remark 

Direct computation of eigenvalues using standard linear algebra algorithms generally 
requires the storage of an entire matrix. This severely restricts the matrix sizes allow- 
able, due to memory constraints on computers. By exploiting the fact that our matrices 
are all binary, symmetric, and easily computable, we were able to obtain the largest 
eigenvalues of much larger matrices. Specifically, the eigenvalues used to obtain the 
capacity bounds in Theorem 1 were computed using the “power method” [8, pg. 406].
Similarly, we obtained the upper bound in (1) with the power method (computing $\Lambda_{1,21}$,
$\Lambda_{1,23}$, and $\xi_{1,24}$). Originally these bounds were computed in [3] as $0.587891161 \le
C^{(2)}_{0,1} \le 0.588339078$ (computing $\Lambda_{1,13}$, $\Lambda_{1,15}$, and …) and were later improved in [4]
(computing $\Lambda_{1,13}$, $\Lambda_{1,14}$, and $\xi_{1,14}$) to $0.587891161775 \le C^{(2)}_{0,1} \le 0.587891494943$.

The lower bound in (1) is from [4].
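The idea of running the power method without storing the matrix can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' program: patterns are enumerated by brute force, each matrix entry is recomputed on the fly from two zero-position bitmasks, and the names `zero_masks` and `largest_eigenvalue` are hypothetical. Only the current vector is kept in memory.

```python
from itertools import product


def zero_masks(rows, cols):
    """Zero-position bitmasks of all rows x cols patterns without adjacent zeros."""
    masks = []
    for bits in product((0, 1), repeat=rows * cols):
        ok = True
        for r in range(rows):
            for c in range(cols):
                if bits[r * cols + c] == 0:
                    if c + 1 < cols and bits[r * cols + c + 1] == 0:
                        ok = False
                    if r + 1 < rows and bits[(r + 1) * cols + c] == 0:
                        ok = False
        if ok:
            masks.append(sum(1 << i for i, b in enumerate(bits) if b == 0))
    return masks


def largest_eigenvalue(rows, cols, iterations=300):
    """Power method for T_{rows,cols}; returns (eigenvalue estimate, matrix size)."""
    masks = zero_masks(rows, cols)
    x = [1.0] * len(masks)
    estimate = 0.0
    for _ in range(iterations):
        # y = T x, where T[i][j] = 1 iff masks[i] and masks[j] share no zero.
        y = [sum(xj for xj, mj in zip(x, masks) if mi & mj == 0) for mi in masks]
        estimate = max(y)                 # x is scaled so that max(x) == 1
        x = [value / estimate for value in y]
    return estimate, len(masks)


for a, b in [(1, 1), (1, 2), (1, 3), (2, 2)]:
    print(a, b, largest_eigenvalue(a, b))
```

For these small cases the estimates can be compared with the first rows of Table 1; the fixed iteration count is a simplification, and a practical implementation would stop on a convergence criterion instead.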
Table 1. Largest eigenvalues of $T_{a,b}$, $B_{a,b}$, and $B^*_{a,b}$ are $\Lambda_{a,b}$, $\xi_{a,b}$, and $\zeta_{a,b}$, respectively. A dash marks an entry that is not listed.

a   b    Λ_{a,b}          rows of T_{a,b}   ξ_{a,b}          rows of B_{a,b}   ζ_{a,b}          rows of B*_{a,b}
1   1    1.61803398875    2                 -                -                 -                -
1   2    2.41421356237    3                 2.41421356237    3                 -                -
1   3    3.63138126040    5                 -                -                 -                -
1   4    5.45770539597    8                 5.15632517466    7                 -                -
1   5    8.20325919376    13                -                -                 -                -
1   6    12.3298822153    21                11.5517095660    18                -                -
1   7    18.5324073775    34                -                -                 -                -
1   8    27.8550990963    55                26.0579860919    47                -                -
1   9    41.8675533183    89                -                -                 -                -
1   10   62.9289457252    144               58.8519350815    123               -                -
1   11   94.5852312050    233               -                -                 -                -
1   12   142.166150393    377               132.947794048    322               -                -
1   13   213.682559741    610               -                -                 -                -
1   14   321.175161677    987               300.345852027    843               -                -
1   15   482.741710897    1597              -                -                 -                -
1   16   725.584002895    2584              678.525669346    2207              -                -
1   17   1090.58764423    4181              -                -                 -                -
1   18   1639.20566742    6765              1532.89283597    5778              -                -
1   19   2463.80493521    10946             -                -                 -                -
1   20   3703.21728345    17711             3463.03987027    15127             -                -
1   21   5566.11363689    28657             -                -                 -                -
1   22   8366.13642876    46368             7823.53857819    39603             -                -
1   23   12574.7053170    75025             -                -                 -                -
1   24   18900.3867144    121393            17674.5747630    103682            -                -
2   2    5.15632517466    7                 5.15632517466    7                 5.15632517466    7
2   3    11.1103016575    17                -                -                 -                -
2   4    23.9250625386    41                21.9287654025    35                21.9287654025    35
2   5    51.5229210280    99                -                -                 -                -
2   6    110.954925971    239               100.236549238    199               100.236549239    199
2   7    238.942175857    577               -                -                 -                -
2   8    514.563569622    1393              463.203410887    1155              463.203410887    1155
2   9    1108.11608218    3363              -                -                 -                -
2   10   2386.33538059    8119              2146.04060032    6727              2146.04060032    6727
2   11   5138.98917320    19601             -                -                 -                -
2   12   11066.8474924    47321             9949.63685703    39203             9949.63685703    39203
3   3    34.4037405361    63                -                -                 -                -
3   4    106.439377528    227               94.2548937790    181               -                -
3   5    329.331697608    827               -                -                 -                -
3   6    1018.97101980    2999              884.498791440    2309              -                -
3   7    3152.75734322    10897             -                -                 -                -
3   8    9754.81971205    39561             8421.60680806    30277             -                -
3   9    30181.9963196    143677            -                -                 -                -
3   10   93384.9044989    521721            80481.0598378    398857            -                -
4   4    473.069084944    1234              404.943621498    933               355.525781764    743
4   5    2102.73425567    6743              -                -                 -                -
4   6    9346.35893702    36787             7799.87080772    26660             6405.69924332    18995
Abstract. We investigate general properties of rectangular codes. The
class of rectangular codes includes all linear, group, and many nongroup 
codes. We define a basis of a rectangular code. This basis gives a universal 
description of a rectangular code. 

In this paper the rectangular algebra is defined. We show that all bases of 
a t-rectangular code have the same cardinality. Bounds on the cardinality 
of a basis of a rectangular code are given. 

1 Introduction 

A block code C is a set of words $c = (c_1, \dots, c_n)$ of length n over an alphabet $Q =
\{0, 1, \dots, q-1\}$. Given $t \in \{1, \dots, n-1\}$, split every codeword c into the head (past)
$p = (c_1, \dots, c_t)$ and the tail (future) $f = (c_{t+1}, \dots, c_n)$, i.e., c is the concatenation
of the head p and the tail f: $c = pf$. A set $C \subseteq Q^n$ is called t-rectangular if the
following implication is true [1] (in [2] such a set was called t-separable):

$$p_1 f_1,\ p_1 f_2,\ p_2 f_1 \in C \ \Rightarrow\ p_2 f_2 \in C. \qquad (1)$$

A set $C \subseteq Q^n$ is called rectangular if it is t-rectangular for each t.

All group codes (and hence all linear codes) are rectangular. Many famous 
nonlinear codes are also rectangular. This includes Hadamard, Levenshtein, 
Delsarte-Goethals codes (and hence Kerdock and Nordstrom-Robinson codes) 
[3], Goethals codes (and hence Preparata codes) [4]. All codewords of a linear 
block code having fixed Hamming weight form a rectangular set. 

The binary code C = {(00), (01), (10)} is the simplest example of a non- 
rectangular code. As an example of a rectangular code consider the binary code 
C = {(0000), (0011), (0101), (1000), (1011), (1101)}. The minimal trellis of the
code C is shown in Fig. 1. There is a one to one correspondence between code- 
words of the code C and paths in the trellis. A trellis is called minimal if it has 
in each depth the minimum number of vertices among all code trellises of the 
given code. 
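Definition (1) is easy to test mechanically. The sketch below represents codewords as strings and uses an equivalent reformulation of (1): a code is t-rectangular exactly when the tail sets of any two heads are either equal or disjoint. The function names are assumptions made for this illustration.

```python
def is_t_rectangular(code, t):
    """Check implication (1) for the split after position t."""
    tails_of = {}
    for word in code:
        tails_of.setdefault(word[:t], set()).add(word[t:])
    tail_sets = list(tails_of.values())
    # Two heads that share one tail must share all of their tails.
    return all(a == b or not (a & b) for a in tail_sets for b in tail_sets)


def is_rectangular(code):
    """A code is rectangular if it is t-rectangular for every t = 1, ..., n-1."""
    n = len(next(iter(code)))
    return all(is_t_rectangular(code, t) for t in range(1, n))


non_rectangular = {"00", "01", "10"}
rectangular = {"0000", "0011", "0101", "1000", "1011", "1101"}

print(is_rectangular(non_rectangular))  # the missing word 11 violates (1)
print(is_rectangular(rectangular))
```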

* The work was supported by Russian Fundamental Research Foundation (project No 
99-01-00840) and by Deutsche Forschungs Gemeinschaft (Germany). 
Rectangular codes have the following nice property. The minimal trellis of a
rectangular code is unique, biproper [1], and minimizes the number of vertices
|V| (by definition), the number |E| of edges, and the cycle rank |E| − |V| + 1. As
a result the Viterbi decoding complexity of a rectangular code is minimum when
using the minimal trellis of the code. In addition, the minimal code trellis gives
a universal compact representation of a rectangular code. If a rectangular code
has no additional structure then perhaps the minimal code trellis is the only
known compact description of the code. We present another universal compact
description of a rectangular set using the notion of a rectangular basis.

Given an arbitrary block code C, a rectangular set that includes C and has
the minimum cardinality is called a rectangular closure of C and is denoted
by [C]. A rectangular closure [C] is unique. We say that a set G generates a
rectangular set S (G is a generating set for S) if [G] = S. A set G is called
independent if for any $g \in G$, $g \notin [G \setminus \{g\}]$. An independent set B generating a
rectangular set S is called a basis of the rectangular set S.

In [8] we obtained the following results. Given a rectangular set S, the Coloring
algorithm was proposed to design a basis B having cardinality

|B| = |E| − |V| + 2, (2)

where |E| and |V| are the numbers of edges and vertices in the minimal trellis of
S, respectively. Thus, a basis gives approximately the same compact description
of a rectangular set as the minimal code trellis. Similar to trellis complexity, the
cardinality of a code basis depends on the order of codeword positions.

It was also shown in [8] that the Merging algorithm [5] , [2] applied to a trellis 
of a set A generates the rectangular closure [A] . The complexity of the minimal 
trellis of a closure [A] is less than the one of any trellis of the nonrectangular 
set A. This fact can be used to simplify iterative decoding algorithms [6], [7]. 
A similar problem of constructing a set with smallest trellis complexity was 
considered in [7]. The Wolf bound is not valid for nonlinear codes (even for 
rectangular codes). 

In this paper we continue investigation of rectangular codes. 

In Section 2 we define the rectangular complement operation and introduce
the rectangular algebra. We believe that the rectangular algebra is an interesting
mathematical object. We show that the algebra does not have the exchange
property for independent sets. Despite this fact we conjecture (Conjecture 12) that all bases
of a rectangular code have the same cardinality (given by (2)). We propose an
upper bound on the cardinality of the rectangular closure of a given set. This bound
was also obtained independently by Yu. Sidorenko [10].

In Section 3 we consider t-rectangular codes. This is equivalent to considering
codes of length 2. We show that Conjecture 12 is true for t-rectangular codes.
In Section 4 using results from rectangular algebra we propose lower and
upper bounds on the cardinality of a rectangular code. The codes attaining the 
upper bound are called prime. The sufficient condition for a code to be prime is 
presented. 

2 Rectangular Algebra 

A universal algebra or, briefly, algebra $\mathcal{A}$ is a pair $\langle A; F \rangle$, where A is a nonvoid
set and F is a family of finitary operations on A. A is called the base set. Define
the base set A to be $A = Q^n = Q \times Q \times \dots \times Q$, where Q is a finite alphabet*. So,
an element $a \in A$ is a vector of length n over Q. For every $t \in \{1, \dots, n-1\}$ we define
a ternary partial operation of t-rectangular complement $r_t : A \times A \times A \to A$ as
follows.

If $a, b, c \in A$ can be represented as the following concatenations

$$a = p_2 f_1,\quad b = p_1 f_1,\quad c = p_1 f_2, \qquad (3)$$

where $p_i \in Q^t$, $f_i \in Q^{n-t}$, then $r_t(a, b, c) = p_2 f_2$, else $r_t(a, b, c)$ is undefined.

Definition 1 The partial algebra $\mathcal{R}e_t = \langle A; r_t \rangle$ is called a t-rectangular algebra.
The rectangular algebra can be defined as < A; ri, . . . , r„_i > having n — 1 
operations. However, we can simplify the definition of the algebra as follows. 

Extend the alphabet Q by joining a special zero element $\theta$, $Q_\theta = Q \cup \{\theta\}$,
and define the partial operation sum as follows: $\forall \alpha, \beta, \gamma \in Q_\theta$

1. $\alpha + \alpha = \theta$;

2. $\alpha + \beta = \beta + \alpha$;

3. $\alpha + \theta = \alpha$;

4. $(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$.

The sum of words over $Q_\theta$ is defined as componentwise sum.

Lemma 2 If $r_t(a, b, c)$ is defined then $r_t(a, b, c) = a + b + c$.

Proof. If $r_t(a, b, c)$ is defined then there exist t-words $p_1, p_2$ and (n − t)-
words $f_1, f_2$ such that (3) is satisfied and $r_t(a, b, c) = p_2 f_2$. On the other hand
$a + b + c = (p_2 + (p_1 + p_1))((f_1 + f_1) + f_2) = (p_2 + \theta)(\theta + f_2) = p_2 f_2$.

Corollary 3 If both $r_i(a, b, c)$ and $r_j(a, b, c)$ are defined then $r_i(a, b, c) = r_j(a, b, c)$.

This Corollary allows us to define the rectangular algebra using only one operation
instead of n − 1 operations.

Definition 4 The rectangular complement operation r(a, b, c) is defined as follows.
If there exists t such that $r_t(a, b, c)$ is defined, then $r(a, b, c) = r_t(a, b, c)$,
else r(a, b, c) is undefined.
* All results are valid for the base set A = Qi x Q 2 ... x Qn, where Qi is a finite 
alphabet. 
Definition 5 The partial algebra $\mathcal{R}e = \langle A; r \rangle$ is called the rectangular algebra.

Now we can give definitions of rectangular codes that are equivalent to (1).

Definition 6 A code $C \subseteq A$ is called t-rectangular if $\langle C; r_t \rangle$ is a subalgebra
of $\mathcal{R}e_t$, i.e., C is closed under the operation $r_t$.

Definition 7 A code $C \subseteq A$ is called rectangular if $\langle C; r \rangle$ is a subalgebra of
$\mathcal{R}e$, i.e., C is closed under the operation r.



Definition 8 A rectangular code of minimum cardinality that includes a set G 
is called the rectangular closure of G and is denoted by [G]. 

Some properties of rectangular codes can be immediately obtained from the 
theory of universal algebra [9]. For example, the intersection of rectangular codes 
is rectangular [1], since the intersection of subalgebras is a subalgebra [9]. The 
Rectangular closure is unique [9] . 

We say that a set G generates a rectangular set C if [G] = C. How many
words can be generated by a set G? The following theorem gives an upper bound
for the rectangular closure of the set G.

Theorem 9 $|[G]| \le 2^{|G|-1}$.

Proof. Let $G = \{g_1, \dots, g_m\}$. Every word $c \in C = [G]$ can be generated by
words from the set G using a chain of rectangular complement operations. So,

$$c = r(y_1, y_2, y_3),$$

where

$$y_i = r(z_1^{(i)}, z_2^{(i)}, z_3^{(i)}),\quad i = 1, 2, 3,$$

and so on until we get the rectangular complement of words from the set G.
Using Lemma 2 for this chain of rectangular complements we get

$$c = y_1 + y_2 + y_3,$$

$$c = z_1^{(1)} + z_2^{(1)} + z_3^{(1)} + y_2 + y_3,$$

$$\dots$$

$$c = g_{i_1} + g_{i_2} + \dots + g_{i_l}, \qquad (4)$$

where l is odd since one item is always replaced by 3 items. Since the sum is
commutative and associative we can rewrite (4) as

$$c = k_1 g_1 + k_2 g_2 + \dots + k_m g_m,$$

where $k_i$ is an integer, $kg = g + \dots + g$ (k times) by definition, and $\sum k_i = l$ is odd.

Using properties 1 and 3 of the sum operation we get

$$c = j_1 g_1 + j_2 g_2 + \dots + j_m g_m,$$

where $j_i \in \{0, 1\}$ and $\sum j_i$ is odd. So, to each word $c \in [G]$ corresponds a binary
sequence $j_1, \dots, j_m$ of odd Hamming weight. Hence, the number $2^{m-1} = 2^{|G|-1}$
of such sequences gives an upper bound for $|[G]|$. Q.E.D.
Definition 10 A set $G \subseteq A$ is called independent if for every $g \in G$: $g \notin [G \setminus \{g\}]$.

Definition 11 A set $B \subseteq A$ is called a rectangular basis of a rectangular code
C if B is independent and [B] = C.

An important question for any universal algebra is: “Do all bases of a closed 
set have the same cardinality?” . 

Conjecture 12 All bases of a rectangular code have the same number of words. 

The invariance of the number of elements in a basis follows from the following
exchange property: let $y, z \notin [X]$ and $z \in [X \cup y]$, then $y \in [X \cup z]$. Unfortunately,
the rectangular algebra does not have this property. To prove this, consider $A =
\{0, 1\}^3$, X = {(100), (010), (001)}, y = (000), z = (111). However we still think
that Conjecture 12 is true. In the next section we show that the conjecture is
true for length 2 codes.
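The counterexample can be confirmed by computing rectangular closures directly. The sketch below uses a naive fixed-point iteration over all splits t and is not the Merging algorithm of [5], [2]; words are strings and the function name is an assumption made for this illustration. It also shows that the generating set $X \cup \{y\}$ attains the bound of Theorem 9.

```python
def rectangular_closure(words):
    """Smallest rectangular set containing `words` (naive fixed-point iteration)."""
    code = set(words)
    if not code:
        return code
    n = len(next(iter(code)))
    while True:
        new_words = set()
        for t in range(1, n):
            tails_of = {}
            for w in code:
                tails_of.setdefault(w[:t], set()).add(w[t:])
            heads = list(tails_of)
            for p1 in heads:
                for p2 in heads:
                    # Heads sharing one tail must share all tails, by (1).
                    if tails_of[p1] & tails_of[p2]:
                        new_words.update(p2 + f for f in tails_of[p1])
        if new_words <= code:
            return code
        code |= new_words


X = {"100", "010", "001"}
y, z = "000", "111"

closure_X = rectangular_closure(X)
closure_Xy = rectangular_closure(X | {y})
closure_Xz = rectangular_closure(X | {z})

print(sorted(closure_X))    # X is already rectangular
print(z in closure_Xy)      # z is generated once y is added
print(y in closure_Xz)      # but y is not generated once z is added
print(len(closure_Xy), 2 ** (len(X | {y}) - 1))
```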

3 Rectangular Codes of Length 2 

Assume that we are interested in t-rectangularity of a code C of length n for a
particular t only. Denote by $Q_p$ the set of all t-heads of the code and by $Q_f$ the
set of all t-tails. Each codeword of the code can be represented as a word pf,
$p \in Q_p$, $f \in Q_f$. In this section we consider only length 2 codes over alphabets
$Q_p$, $Q_f$.

Every rectangular code can be represented as a union of disjoint subcodes
[1, 2]

$$C = \bigcup_{i=1}^{m} P_i \times F_i, \qquad (5)$$

where $P_i \subseteq Q_p$, $F_i \subseteq Q_f$, $P_i \cap P_j = \emptyset$, $F_i \cap F_j = \emptyset$, $i \ne j$, and every subcode
$C_i = P_i \times F_i$ is a direct product: $C_i = \{pf : p \in P_i, f \in F_i\}$. The following
theorem gives the cardinality of a basis of a direct product.

Theorem 13 Let B be a basis of a length 2 code $C = P \times F$, then

$$|B| = |P| + |F| - 1. \qquad (6)$$

Proof. Consider a $|P| \times |F|$ rectangular array. Rows of the array are labeled
by words of the set P, columns are labeled by words from F. A word c = pf
from the code C will be represented by a cross in the array in row p and column
f. The code C is represented by the array A(C) completely filled with crosses.

Let us fill the empty array with words of a basis B of the code C, and denote the
array by A(B). We say that three words a, b, c that satisfy (3) form a triangle
because they occupy 2 columns and 2 rows in the array. These words generate
the fourth word d = r(a, b, c) situated in the same lines. Using this procedure of
cross generating, the complete array must be filled starting from A(B).

1. Every line (row or column) of the array A(B) must contain a cross. Otherwise
this line will not be filled.

2. Let the first column of the array A(B) have $n_1 \ge 1$ crosses. These crosses
occupy $n_1 + 1$ lines ($n_1$ rows and 1 column). Denote by $L_r$ the set of these rows
and by $L_c$ the set consisting of the first column. The cells $L_r \times L_c$ of the array
A(B) are filled with crosses.

3. Among the remaining columns $F \setminus L_c$ there exists a column f having a cross in
rows $L_r$. Otherwise the crosses in cells $L_r \times L_c$ are isolated and the array will
not be filled; this contradicts the fact that B is a basis.

The column f has exactly 1 cross in rows $L_r$, say in row p. If it has more
than 1 cross then B is dependent because the set of crosses $L_r \times L_c$ and one
cross pf will generate all crosses $L_r \times f$.

Join the column f to the set $L_c$ and join the rows occupied by crosses in column
f to the set $L_r$. In total we join $n_f$ new occupied lines, where $n_f$ is the number of
crosses in the column f. Using the column f the whole array $L_r \times L_c$ can be
filled.

4. We repeat step 3 until all columns of the array are exhausted.
After this we have $L_c = F$ and $L_r = P$ because the whole array must be filled.
Hence the number of occupied lines will be $|P| + |F|$. On the other hand, by
construction the number of occupied lines is

$$1 + \sum_{f \in F} n_f = 1 + |B| = |P| + |F|,$$

from which the statement of the theorem follows. Q.E.D.

Since the tailsets $F_i$ and headsets $P_i$ in (5) are disjoint, from Theorem 13 we get

Theorem 14 Let B be a basis of a length 2 code C that satisfies (5), then

$$|B| = \sum_{i} \left(|P_i| + |F_i| - 1\right). \qquad (7)$$

The minimal trellis of a length 2 code has m + 2 vertices and $\sum_i (|P_i| + |F_i|)$
edges. So, from Theorem 14 we have

Theorem 15 Let B be a basis of a length 2 rectangular code having the minimal
trellis T = (V, E), then |B| = |E| − |V| + 2.

Since the minimal trellis of a rectangular code is unique it follows from Theorem 15
that all bases of a rectangular length 2 code have the same cardinality.
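Theorems 13 to 15 can be illustrated on a small length 2 code. The sketch below builds a code from two hypothetical blocks $P_i \times F_i$, picks one particular basis (a "star" per block: the first head with every tail, plus every other head with the first tail), and checks its size against the edge and vertex counts of the minimal trellis. The block contents, the star construction and the function names are assumptions made for illustration; this is not the Coloring algorithm of [8].

```python
def closure_length2(words):
    """Rectangular closure of a length 2 code (only the split t = 1 exists)."""
    code = set(words)
    while True:
        tails_of = {}
        for w in code:
            tails_of.setdefault(w[0], set()).add(w[1])
        new_words = {p2 + f
                     for p1 in tails_of for p2 in tails_of
                     if tails_of[p1] & tails_of[p2]
                     for f in tails_of[p1]}
        if new_words <= code:
            return code
        code |= new_words


# Hypothetical blocks (P_i, F_i) with pairwise disjoint heads and tails.
blocks = [("ab", "xyz"), ("c", "uv")]

code = {p + f for heads, tails in blocks for p in heads for f in tails}

# Star-shaped basis: |P_i| + |F_i| - 1 words per block.
basis = set()
for heads, tails in blocks:
    basis |= {heads[0] + f for f in tails}
    basis |= {p + tails[0] for p in heads[1:]}

generates = closure_length2(basis) == code
independent = all(g not in closure_length2(basis - {g}) for g in basis)

num_vertices = len(blocks) + 2                       # root, goal, one per block
num_edges = sum(len(h) + len(t) for h, t in blocks)  # one edge per head and tail

print(generates, independent, len(basis), num_edges - num_vertices + 2)
```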

4 Bounds on the Cardinality of a Basis 

Now we return to the general case of rectangular codes of length n. What can
we say about the cardinality of a basis B(C) of a code C if only |C| and |Q| are
known? From Theorem 9 we get

Theorem 16 The cardinality of a basis B(C) of a binary rectangular code C is
bounded by

$$\log_2 |C| + 1 \le |B(C)| \le |C|.$$

The lower bound can be attained for a binary alphabet. So in the binary case
the bounds cannot be improved if only |C| is available.

There exists a wide class of rectangular codes that attain the upper bound.
We call these codes prime because they can be generated only by the complete
code.

Definition 17 A rectangular code C that satisfies B(C) = C is called a prime
code.

The following theorem gives a sufficient condition for a code to be prime.

Theorem 18 Let C be a rectangular code having minimum distance d(C) and
diameter D(C) in the Hamming metric and

$$2d(C) > D(C), \qquad (8)$$

then |B(C)| = |C|.

The condition (8) in Theorem 18 can be replaced by d(C) > n/2. It follows
from Theorem 18 and from the Coloring algorithm that |E| − |V| + 2 = |C| if (8)
is satisfied. So |E| − |V| does not depend on reordering of codeword positions
and one can find a permutation-minimal trellis that minimizes simultaneously
|V| and |E|, since |E| − |V| is constant (this means minimization of the Viterbi
decoding complexity).
Abstract. We present an efficient implementation of Sudan’s algorithm
for list decoding Hermitian codes beyond half the minimum distance. 
The main ingredients are an explicit method to calculate so-called in- 
creasing zero bases, an efficient interpolation algorithm for finding the Q- 
polynomial, and a reduction of the problem of factoring the Q-polynomial 
to the problem of factoring a univariate polynomial over a large finite 
field. 



1 Introduction 

In 1997 M. Sudan [1] presented a conceptually easy algorithm for decoding Reed-
Solomon codes of rates less than 1/3 beyond half the minimum distance. The
method was extended to algebraic geometry codes by M.A. Shokrollahi and H. 
Wassermann in [2], and M. Sudan and V. Guruswami in [3] further improved 
the algorithm to cover all rates both for Reed-Solomon codes and for general 
algebraic geometry codes. It is clear from [3] that the resulting algorithm has 
polynomial complexity, but also that some more work was needed to make the 
computations really efficient. In this paper we address that problem. The paper 
is organized as follows. Section 2 gives the necessary background on algebraic 
geometry codes, in particular the codes that come from the Hermitian curve. 
Section 3 gives the prerequisites on multiplicities and the concept of increasing 
zero bases, and Section 4 presents and proves Sudan’s algorithm. Section 5 gives 
an efficient method for calculating increasing zero bases and Section 6 gives a 
fast interpolation algorithm for finding the Q-polynomial. Section 7 treats the 
factorization problem and reduces it to factoring a univariate polynomial over 
a large finite field for which Berlekamp’s algorithm [4] can be used. Section 8 is 
the conclusion. A version including more detailed proofs and an example can be 
obtained at 

2 Hermitian Codes 

Let $\chi$ be a nonsingular absolutely irreducible curve over $\mathbb{F}_q$ and let $P_1, \ldots, P_n, P_\infty$
be $\mathbb{F}_q$-rational points on $\chi$. The curve defines an algebraic function field, $\mathbb{F}_q(\chi)$,
with a discrete valuation, $v_{P_i}$, corresponding to each point ($i = 1, \ldots, n, \infty$).
More details can be found in [5].



Marc Fossorier et al. (Eds.): AAECC-13, LNCS 1719, pp. 260—269, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 




Decoding Hermitian Codes with Sudan’s Algorithm 






A class of algebraic geometry codes is given by

$$C_{\mathcal{L}}(P_1 + \cdots + P_n, mP_\infty) = \{(f(P_1), \ldots, f(P_n)) \mid f \in \mathcal{L}(mP_\infty)\}, \quad m < n$$

where $\mathcal{L}(mP_\infty) = \{f \in \mathbb{F}_q(\chi) \mid v_{P_\infty}(f^{-1}) \le m \wedge v_{P_i}(f) \ge 0 \text{ for } i = 1, \ldots, n\}$.
The length of this code is $n$, and if $g$ denotes the genus of $\chi$ and $2g - 1 \le m < n$
then the dimension of the code is $k = m - g + 1$ and the minimum distance is
lower bounded by $d^* = n - m$ since the number of zeroes of a non-zero function
cannot exceed the number of poles.

The Hermitian codes over $\mathbb{F}_{q_1^2}$ are the codes defined by the above method
using the Hermitian curve as $\chi$:

$$X^{q_1+1} - Y^{q_1} - Y = 0$$

It is well-known that this curve indeed is nonsingular and absolutely irreducible.
Furthermore, the curve contains $q_1^3$ affine $\mathbb{F}_{q_1^2}$-rational points and has genus
$\frac{q_1(q_1-1)}{2}$. In this case, the point $P_\infty$ corresponds to the (unique) point at
infinity on the homogenization of the Hermitian curve.

In the following it will be assumed that a function field, $\mathbb{F}_q(\chi)$, is given and
that $P_1, \ldots, P_n, P_\infty$ are points on a nonsingular and absolutely irreducible curve,
$\chi$.

3 Prerequisites 

For $\ell \ge 2g - 1$, $\mathcal{L}(\ell P_\infty)$ is a vector space over $\mathbb{F}_q$ of dimension $\ell - g + 1$. It is well
known that $\mathcal{L}(\ell P_\infty)$ has a basis, $\phi_1, \ldots, \phi_{\ell-g+1}$, where the pole order at $P_\infty$ is
increasing:

$$v_{P_\infty}(\phi_1^{-1}) < v_{P_\infty}(\phi_2^{-1}) < \cdots < v_{P_\infty}(\phi_{\ell-g+1}^{-1})$$

However, the following theorem (from [3]) shows the existence of bases where
the zero multiplicity of a given point - different from $P_\infty$ - is increasing.
Furthermore, the proof of the theorem describes a strategy to find these bases.

Theorem 1. Let $P_i$ ($i \in \{1, \ldots, n\}$) be a point and let $\ell \ge 2g - 1$. Then there
exists a basis, $\phi_{1,i}, \ldots, \phi_{\ell-g+1,i}$, of $\mathcal{L}(\ell P_\infty)$ such that

$$v_{P_i}(\phi_{1,i}) < v_{P_i}(\phi_{2,i}) < \cdots < v_{P_i}(\phi_{\ell-g+1,i})$$

In the following, such a basis will be called an increasing zero basis with respect
to the point $P_i$.

Proof. Suppose that some basis, $B = \{\phi_1, \ldots, \phi_{\ell-g+1}\}$, of $\mathcal{L}(\ell P_\infty)$ is given.
Suppose that two functions have the same valuation at $P_i$. Then one of them can
be replaced by a suitable linear combination of the two having greater valuation.
This can be repeated until no two basis functions have the same valuation
at $P_i$ and an increasing zero basis is obtained. □
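
The replacement strategy of the proof can be sketched in a few lines. The sketch below is only an illustration under simplifying assumptions that are not in the paper: each basis function is represented by a truncated list of local expansion coefficients at $P_i$ over a small prime field GF(7), so that the valuation is the index of the first non-zero coefficient, and the input is assumed to be linearly independent. The names `P`, `valuation` and `increasing_zero_basis` are hypothetical.

```python
P = 7  # small prime field GF(7); an assumption made only for this illustration


def valuation(coeffs):
    """Index of the first non-zero local expansion coefficient (None for zero)."""
    for k, c in enumerate(coeffs):
        if c % P:
            return k
    return None


def increasing_zero_basis(basis):
    """Repeat the step from the proof until all valuations are distinct.

    `basis` is a list of coefficient lists of equal length, assumed to be
    linearly independent over GF(P).
    """
    basis = [[c % P for c in f] for f in basis]
    while True:
        seen = {}
        clash = None
        for idx, f in enumerate(basis):
            v = valuation(f)
            if v in seen:
                clash = (seen[v], idx, v)
                break
            seen[v] = idx
        if clash is None:
            return sorted(basis, key=valuation)
        i, j, v = clash
        # Subtract the multiple of basis[i] that cancels the leading
        # coefficient of basis[j]; the valuation of basis[j] strictly grows.
        factor = basis[j][v] * pow(basis[i][v], -1, P) % P
        basis[j] = [(b - factor * a) % P for a, b in zip(basis[i], basis[j])]


example = increasing_zero_basis([[1, 1, 0], [1, 2, 0], [1, 0, 1]])
print([valuation(f) for f in example])
```

Each pass raises one valuation, so the loop terminates once the valuations are pairwise distinct. In the function field setting the expansion coefficients would have to come from a standard representation such as the one computed in Section 5.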
Recall that the non-negative integers, $\mathbb{N}$, are divided into gaps and non-gaps
by calling $s \in \mathbb{N}$ a gap if and only if $\mathcal{L}(sP_\infty) \setminus \mathcal{L}((s-1)P_\infty) = \emptyset$. The number of
gaps equals the genus, $g$, of the curve defining the function field. For $t \in \mathbb{N}$, $g(t)$
denotes the number of gaps less than or equal to $t$. That is

$$g(t) := t - \dim(\mathcal{L}(tP_\infty)) + 1 \qquad (1)$$
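
As a small numerical illustration of (1), the sketch below counts gaps for the Hermitian function field. It relies on the standard fact, not stated in this section, that the non-gaps at $P_\infty$ are generated by $q_1$ and $q_1 + 1$; the function names are hypothetical.

```python
def nongaps(q1, bound):
    """Non-gaps at P_infinity up to `bound`, assuming they are generated by q1 and q1 + 1."""
    return sorted({
        a * q1 + b * (q1 + 1)
        for a in range(bound // q1 + 1)
        for b in range(bound // (q1 + 1) + 1)
        if a * q1 + b * (q1 + 1) <= bound
    })


def g(t, q1):
    """Number of gaps <= t, computed as in (1): t - dim L(t P_infinity) + 1."""
    dim = len(nongaps(q1, t))  # one basis function per non-gap <= t
    return t - dim + 1


q1 = 4
print(nongaps(q1, 11), g(11, q1), q1 * (q1 - 1) // 2)
```

For $q_1 = 4$ the gaps are 1, 2, 3, 6, 7 and 11, so $g(t)$ reaches the genus $q_1(q_1 - 1)/2 = 6$ at $t = 11$.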

Let $R$ denote the following vector space:

$$R := \bigcup_{i=0}^{\infty} \mathcal{L}(iP_\infty)$$

Suppose that $R = \mathrm{span}\{\phi_i \mid i \ge 1\}$ with the pole orders of the $\phi_i$’s being strictly
increasing. Then $R[z] = \mathrm{span}\{\phi_i z^j \mid i \ge 1 \wedge j \ge 0\}$ (where $z$ is transcendental
over $\mathbb{F}_q(\chi)$). We will define a total ordering on these basis functions by associating
a non-negative integer - called the weight - to each function. The ordering will
be parameterized by the number associated with $z$. Let this be denoted by $\rho(z)$.
Then the weight of the basis function $\phi_i z^j$ is given by

$$\rho(\phi_i z^j) = v_{P_\infty}(\phi_i^{-1}) + j\rho(z) \qquad (2)$$

An ordering can now be defined using some lexicographic rule to break ties, for
example:

$$\phi_i z^j < \phi_a z^b \iff \rho(\phi_i z^j) < \rho(\phi_a z^b) \vee (\rho(\phi_i z^j) = \rho(\phi_a z^b) \wedge j < b) \qquad (3)$$

However, in this context only the weighting is important.

$\rho$ is extended to any non-zero function in $R[z]$ by the following definition:

Definition 1. Let $f \in R[z] \setminus \{0\}$. Suppose that $f = \sum_{i,j} f_{i,j} \phi_i z^j$ and that $\rho(z)$
is given. Then the weight of $f$ is defined as follows where $\rho(\phi_i z^j)$ is given by (2):

$$\rho(f) = \max\{\rho(\phi_i z^j) \mid f_{i,j} \ne 0\}$$

The following lemmas describe the weight of the basis functions and the
concept of zero multiplicities (proofs are omitted):

Lemma 1. Let $\rho(z) \ge 2g - 1$ be given and suppose that the basis functions of
$R[z]$ are enumerated as $Q_0, Q_1, \ldots$ so that $\rho(Q_0) \le \rho(Q_1) \le \cdots$.

Let $j \in \mathbb{N}$ be given. Then let $r$ and $t$ satisfy

$$\binom{r}{2}\rho(z) - (r-1)g \le j < \binom{r+1}{2}\rho(z) - rg$$

$$tr - g(t) \le j - \left(\binom{r}{2}\rho(z) - (r-1)g\right) < (t+1)r - g(t+1)$$

where $g(t)$ is given by (1).

The weight of $Q_j$ is now given by

$$\rho(Q_j) = (r-1)\rho(z) + t$$

Definition 2. Let $f \in R[z]$ with $f = \sum_j f_j (z - z_0)^j$ for some $z_0 \in \mathbb{F}_q$.

Then the pair, $(P_i, z_0)$, is said to be a zero of multiplicity $s$ of $f$ if

$$v_{P_i}(f_j) \ge s - j \text{ for all } j < s$$

and $v_{P_i}(f_j) = s - j$ for some $j \le s$.

Lemma 2. If $(P_i, z_0)$ is a zero of multiplicity $s$ of $f$ then $v_{P_i}(f(t + z_0)) \ge s$ for
any $t \in R$ with $v_{P_i}(t) \ge 1$.

Lemma 3. Let $\phi_{1,i}, \ldots, \phi_{\ell-g+1,i}$ be an increasing zero basis with respect to $P_i$
and consider a non-zero polynomial, $f \in R[z]$. $f$ can then be written as

$$f(z) = \sum_{j=0}^{u} \sum_{k} f_{j,k} \phi_{k,i} z^j$$

with $u \ge \deg(f)$. If $f'_{j,k} := \sum_{l=j}^{u} f_{l,k} \binom{l}{j} z_0^{l-j} = 0$ for all $j + k \le s$ then $(P_i, z_0)$
is a zero of multiplicity at least $s$ of $f$.
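
The quantities $f'_{j,k}$ in Lemma 3 are the coefficients of $f(z + z_0)$, read here as $\sum_{l \ge j} f_{l,k} \binom{l}{j} z_0^{l-j}$. The sketch below computes them for coefficients in a prime field GF(p); the prime field, the nested-list layout `f[l][k]` and the function name are assumptions made only for illustration (the paper works over $\mathbb{F}_{q_1^2}$).

```python
from math import comb


def shifted_coefficients(f, z0, p):
    """Return f'[j][k] = sum_{l >= j} f[l][k] * C(l, j) * z0**(l - j) mod p.

    f[l][k] is the coefficient of phi_k * z**l; all rows have the same length.
    """
    u = len(f) - 1
    width = len(f[0])
    return [
        [
            sum(f[l][k] * comb(l, j) * pow(z0, l - j, p) for l in range(j, u + 1)) % p
            for k in range(width)
        ]
        for j in range(u + 1)
    ]


# f(z) = z**2 with a single basis function: f(z + 1) = 1 + 2z + z**2 over GF(5)
print(shifted_coefficients([[0], [0], [1]], 1, 5))
```

A zero-condition in the sense of Lemma 3 then amounts to requiring that one entry of the returned table vanishes.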

4 Sudan’s Algorithm 

Let $B(w, \tau)$ denote the ball in $\mathbb{F}_q^n$ with center $w$ and radius $\tau$:

$$B(w, \tau) := \{v \in \mathbb{F}_q^n \mid d(w, v) \le \tau\}$$

The decoding problem for a code, $C \subseteq \mathbb{F}_q^n$, and a received word, $w \in \mathbb{F}_q^n$, can
then be specified as calculating the set

$$\mathrm{dec}_\tau(w) := C \cap B(w, \tau)$$

where $\tau \ge 0$ is an integer. If $\tau$ is smaller than half the minimum distance, then
$\mathrm{dec}_\tau(w)$ will always contain at most one codeword. However, we will allow $\tau$ to
be greater than or equal to half the minimum distance. When that is the case,
decoding may not be unique since $\mathrm{dec}_\tau(w)$ may contain two or more codewords.
This is therefore called list-decoding, and we will refer to $\tau$ as the error-correcting
capability of a decoding algorithm which is capable of calculating $\mathrm{dec}_\tau(w)$ for
any received word, $w$.

The version of Sudan’s algorithm given below was first presented in [3] . The 
algorithm can be seen as an extension of the generalization of Sudan’s original 
algorithm to the case of algebraic geometry codes (see [2] and [1]). The extension 
gives an improved error-correcting capability over the original algorithm at all 
information rates if the code is sufficiently long. The description used here is 
from the presentation of Sudan’s original algorithm in [6]. 

Algorithm 1.

Input: The code $C_{\mathcal{L}}(P_1 + \cdots + P_n, (k + g - 1)P_\infty)$ with $k > g - 1$, a received
word, $w = (w_1, \ldots, w_n) \in \mathbb{F}_q^n$, and a parameter, $s \ge 1$.

Output: $\mathrm{dec}_{\tau_s}(w)$.

— Set $\rho(z) := k + g - 1$ and calculate $r_s$ and $t$ as in Lemma 1 with $j := n\binom{s+1}{2}$. Now

$$\tau_s = n - \left\lfloor \frac{(r_s - 1)\rho(z) + t}{s} \right\rfloor - 1$$

— Calculate $Q(z) \in R[z] \setminus \{0\}$ so that

1. For all $i$, $Q$ has $(P_i, w_i)$ as a zero of multiplicity at least $s$ (see Definition 2).

2. $\rho(Q) \le (r_s - 1)\rho(z) + t$

— Factorize $Q$ into irreducible factors.

— If $z - f$ divides $Q$ and $f \in \mathcal{L}((k + g - 1)P_\infty)$ and $d((f(P_1), \ldots, f(P_n)), w) \le \tau_s$
then add $(f(P_1), \ldots, f(P_n))$ to the set of candidates. That is

$$\mathrm{dec}_{\tau_s}(w) := \{(f(P_1), \ldots, f(P_n)) \mid f \in \mathcal{L}((k + g - 1)P_\infty) \wedge (z - f) \mid Q \wedge d((f(P_1), \ldots, f(P_n)), w) \le \tau_s\}$$
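
To get a feeling for the decoding radius, the following sketch computes $\tau_s$ for a Hermitian code by brute force. It is an illustration under explicit assumptions rather than the authors' procedure: it takes $\rho(z) = k + g - 1$, $j = n\binom{s+1}{2}$ and $\tau_s = n - \lfloor \rho(Q_j)/s \rfloor - 1$, it obtains $\rho(Q_j)$ by sorting the weights (2) directly instead of using Lemma 1, and it uses the standard fact that the pole orders at $P_\infty$ are generated by $q_1$ and $q_1 + 1$. All function names are hypothetical.

```python
def hermitian_nongaps(q1, bound):
    """Pole orders at P_infinity up to `bound` (assumed generated by q1 and q1 + 1)."""
    return sorted({
        a * q1 + b * (q1 + 1)
        for a in range(bound // q1 + 1)
        for b in range(bound // (q1 + 1) + 1)
        if a * q1 + b * (q1 + 1) <= bound
    })


def weight_of_Q(j, q1, rho_z):
    """Weight of Q_j when the monomials phi_i * z**b are sorted by weight (2)."""
    bound = rho_z
    while True:
        poles = hermitian_nongaps(q1, bound)
        weights = sorted(
            p + b * rho_z
            for b in range(bound // rho_z + 1)
            for p in poles
            if p + b * rho_z <= bound
        )
        # All monomials of weight <= bound are listed, so a long enough list
        # is a correct prefix of the full enumeration.
        if len(weights) > j:
            return weights[j]
        bound *= 2


def tau(q1, k, s):
    """Decoding radius tau_s under the assumptions stated in the lead-in."""
    n = q1 ** 3
    g = q1 * (q1 - 1) // 2
    rho_z = k + g - 1
    j = n * s * (s + 1) // 2
    return n - weight_of_Q(j, q1, rho_z) // s - 1


print(tau(2, 2, 1), tau(2, 2, 2))
```

For $q_1 = 2$ and $k = 2$ (so $n = 8$ and $d^* = 6$) the sketch gives $\tau_1 = 2$, the classical radius, and $\tau_2 = 3$, one error beyond half the minimum distance.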

Any polynomial, Q, satisfying the two conditions in the algorithm will be 
called a Q-polynomial (with zero multiplicity s) in the following. The correctness 
of this algorithm must be shown by proving the existence of a Q-polynomial and 
proving that such a polynomial has the right factors. The existence is given by 
the following theorem: 

Theorem 2. A Q-polynomial does exist.

Proof. By Lemma 3 it is clear that the first condition on a Q-polynomial can 
be written as a system of homogeneous linear equations. By Lemma 1 there is a 
non-zero solution satisfying the second condition. □ 

The fact that a Q-polynomial has all the factors corresponding to codewords
in $\mathrm{dec}_{\tau_s}(w)$ is proved by the following lemma and theorem:

Lemma 4. Let $f \in \mathcal{L}(mP_\infty)$ and suppose that $f(P_i) = w_i$. Then $v_{P_i}(Q(f)) \ge s$.

Proof. Follows from the definition of $Q$ and Lemma 2 since $v_{P_i}(f - w_i) \ge 1$. □

Theorem 3. Let $Q$ be a Q-polynomial and suppose that $f \in \mathcal{L}((k + g - 1)P_\infty)$ with
$d((f(P_1), \ldots, f(P_n)), w) \le \tau_s$. Then $(z - f) \mid Q$.

Proof. Let $h = Q(f)$. Then $\rho(h) \le (r_s - 1)\rho(z) + t$. By Lemma 4 $v_{P_i}(h) \ge s$
for each value of $i$ where $f(P_i) = w_i$. This happens at least $n - \tau_s$ times. So the
total number of zero multiplicities of $h$ is at least

$$\sum_{i=1}^{n} v_{P_i}(h) \ge s(n - \tau_s) = s\left(\left\lfloor \frac{(r_s - 1)\rho(z) + t}{s} \right\rfloor + 1\right) > (r_s - 1)\rho(z) + t \ge v_{P_\infty}(h^{-1})$$

So $h = 0$ and therefore $z - f$ must divide $Q$. □



Remark 1. Notice that by Lemma 1 the degree of $Q$ must be less than $r_s$. This
means that Sudan’s algorithm gives at most $r_s - 1$ codewords as output. So

$$|\mathrm{dec}_{\tau_s}(w)| < r_s$$

5 Calculating Increasing Zero Bases

In principle, the method for calculating increasing zero bases, which is described
in the proof of Theorem 1, is perfectly fine. However, in practice there are some
unsolved problems since it is not trivial to calculate the standard representation
of a function and evaluate the unit. In this section both problems are solved in
the case of the Hermitian function field.

Suppose that we want to calculate the standard representation of some polynomial,
$f \in O_{P_i}$, with respect to the point $P_i = (x_i, y_i)$ ($i \in \{1, \ldots, n\}$). In this
case $x - x_i$ is a valid local parameter in $P_i$. Since $f$ is regular in $P_i$ the valuation
of $f$ in $P_i$ can in principle be calculated by repeatedly dividing $f$ by $x - x_i$ until
a unit (a function which evaluates to a non-zero value in $P_i$) is obtained.

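$f$ can now be written as

$$f = \sum_{\alpha=0}^{q_1} \sum_{\beta \in \mathbb{N}} f_{\alpha,\beta} (x - x_i)^\alpha (y - y_i)^\beta \qquad (4)$$

where $f_{\alpha,\beta} = \sum_{j \in \mathbb{Z}} f_{\alpha,\beta,j} e^j$ with $f_{\alpha,\beta,j} \in \mathbb{F}_{q_1^2}$, $f_{\alpha,\beta,j} = 0$ for all but a finite
number of values of $j$, and $e := (y - y_i)^{q_1 - 1} + 1$. It should be mentioned that
this representation of $f$ is not unique; however, that will not be a problem in
this context. The idea of using this representation is that $f$ can now be divided
by $x - x_i$ in such a way that the result is a function which can be written in the
same form. This is seen by noticing that in the Hermitian function field we have

$$\frac{y - y_i}{x - x_i} = \frac{(x - x_i)^{q_1} + x_i (x - x_i)^{q_1 - 1} + x_i^{q_1}}{e} \qquad (5)$$

$e$ is a unit since $e(P_i) = 1$ and furthermore $f(P_i) = \sum_j f_{0,0,j}$. In the case
where $f$ is not a unit ($f(P_i) = 0$) let $s := \min\{j \mid f_{0,0,j} \ne 0\}$. Then

$$f_{0,0} = e^s \sum_{j \in \mathbb{Z}} f_{0,0,j} e^{j - s}$$

where $\sum_j f_{0,0,j} e^{j - s}$ is a polynomial in $y$, which is divisible by $y - y_i$, since $f$
is not a unit. So with $h := \frac{\sum_j f_{0,0,j} e^{j - s}}{y - y_i}$ we have

$$\frac{f_{0,0}}{x - x_i} = e^s \frac{y - y_i}{x - x_i} h = e^{s-1}\left((x - x_i)^{q_1} + x_i (x - x_i)^{q_1 - 1} + x_i^{q_1}\right) h \qquad (6)$$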
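
The quotient of $y - y_i$ by $x - x_i$ used above can be checked directly from the curve equation. The derivation below is a sketch which assumes that the Hermitian curve is given by $x^{q_1+1} = y^{q_1} + y$, that $P_i = (x_i, y_i)$ lies on it, and that $q_1$ is a power of the characteristic, so that $(a + b)^{q_1} = a^{q_1} + b^{q_1}$. The shorthand $u$ is introduced here only for illustration.

```latex
\begin{align*}
u &:= x - x_i, \qquad e := (y - y_i)^{q_1 - 1} + 1, \\
x^{q_1+1} - x_i^{q_1+1}
  &= (u + x_i)^{q_1}(u + x_i) - x_i^{q_1+1}
   = (u^{q_1} + x_i^{q_1})(u + x_i) - x_i^{q_1+1} \\
  &= u\left(u^{q_1} + x_i u^{q_1-1} + x_i^{q_1}\right), \\
(y^{q_1} + y) - (y_i^{q_1} + y_i)
  &= (y - y_i)^{q_1} + (y - y_i) = (y - y_i)\,e .
\end{align*}
% Both left-hand sides are equal on the curve, so dividing by u e gives
\[
\frac{y - y_i}{x - x_i}
  = \frac{(x - x_i)^{q_1} + x_i (x - x_i)^{q_1-1} + x_i^{q_1}}{e}.
\]
```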

This leads to the following algorithm for calculating the standard representation
of a polynomial in the Hermitian function field:

Algorithm 2.

Input: Polynomial, $f$, and point $P_i = (x_i, y_i)$.

Output: $u$ and $m$ so that $m = v_{P_i}(f)$ and $f = u(x - x_i)^m$.

1. Initialization: $f^{(0)} := f = \sum_{\alpha,\beta} f_{\alpha,\beta} (x - x_i)^\alpha (y - y_i)^\beta$, $m := 0$.

2. If $\sum_j f^{(m)}_{0,0,j} \ne 0$ then let $u := f^{(m)}$ and stop.

3. Use equations (6) and (5) to calculate

$$f^{(m+1)} := \frac{f^{(m)}}{x - x_i} = \frac{f^{(m)}_{0,0}}{x - x_i} + \sum_{\alpha \ge 1, \beta} f^{(m)}_{\alpha,\beta} (x - x_i)^{\alpha - 1} (y - y_i)^\beta + \sum_{\beta \ge 1} f^{(m)}_{0,\beta} \frac{y - y_i}{x - x_i} (y - y_i)^{\beta - 1}$$

Let $m := m + 1$ and go to step 2.

The above algorithm for calculating standard representations provides all
that is needed to implement the method for determining increasing zero bases
in the proof of Theorem 1 since the unit of the standard representation is given
in a form which can be evaluated directly.



6 Interpolation 



The goal of the interpolation step is to determine a valid Q-polynomial. As
mentioned in Theorem 2, such a polynomial must exist in the vector space

$$\mathrm{span}\{Q_0, \ldots, Q_\ell\}, \quad \ell = n\binom{s+1}{2}$$

with $Q_0, \ldots, Q_\ell$ being as in Lemma 1, and a Q-polynomial can be found by
solving a system of linear equations using Gaussian elimination (in the following
each of these equations will be referred to as a zero-condition, see Lemma 3).
However, the system has a special structure, and that can be used to make 
the calculations more efficient. One method for doing this is described in the 
following. The method is an application of the Fundamental Iterative Algorithm, 
along the same lines as the application in [7], Chapter 4. The Fundamental 
Iterative Algorithm was first presented in [8] . 
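
For comparison with the structured method developed below, the plain Gaussian elimination approach can be sketched as follows. The sketch finds a non-zero solution of a homogeneous linear system over a prime field GF(p); the restriction to a prime field, the row layout (one row per zero-condition, one column per unknown coefficient of $Q$) and the function name are assumptions made only for illustration.

```python
def nonzero_kernel_vector(rows, p):
    """Return a non-zero x with rows * x = 0 (mod p), or None if only x = 0 works.

    `rows` is a non-empty list of equal-length coefficient lists.
    """
    rows = [[c % p for c in r] for r in rows]
    ncols = len(rows[0])
    pivots = {}  # pivot column -> row index
    r = 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][c], -1, p)
        rows[r] = [v * inv % p for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                factor = rows[i][c]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[r])]
        pivots[c] = r
        r += 1
        if r == len(rows):
            break
    free = next((c for c in range(ncols) if c not in pivots), None)
    if free is None:
        return None
    # Set one free unknown to 1 and solve for the pivot unknowns.
    x = [0] * ncols
    x[free] = 1
    for c, i in pivots.items():
        x[c] = (-rows[i][free]) % p
    return x


print(nonzero_kernel_vector([[1, 2, 3], [0, 1, 4]], 5))
```

With more unknowns than equations a free column always exists, which is the counting argument behind the existence of a Q-polynomial.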

Let $\mathrm{ord} : R[z] \to \mathbb{N}_\infty$ be given by $\mathrm{ord}(0) = -\infty$, $\mathrm{ord}(\mathrm{span}\{Q_0\} \setminus \{0\}) = \{0\}$
and

$$\mathrm{ord}(\mathrm{span}\{Q_0, \ldots, Q_i\} \setminus \mathrm{span}\{Q_0, \ldots, Q_{i-1}\}) = \{i\}$$

for $i > 0$. If $f \in R[z]$ then $\mathrm{ord}(f)$ will be called the order of the function $f$.

The task of finding a Q-polynomial can now be rephrased as the task of
finding a polynomial which satisfies the zero-conditions and which is minimal
with respect to ord. It will be safe only to look for such a polynomial in the
vector space

$$V := \{f z^j \mid f \in R \wedge 0 \le j \le \max\{\deg(Q_i) \mid 0 \le i \le \ell\}\}$$

The idea of the algorithm which is presented in the following is to make a
partition of $V$ and then consider the zero-conditions one by one while maintaining
a list of polynomials - one from each partition class - all of which satisfy the
zero-conditions considered so far, and all of which are minimal within the given
partition class.

To do this we need for each point a (small) element of $R[z]$ which has a
given pair, $(P_i, w_i)$, as a zero of multiplicity at least $s$ and which whenever it is
multiplied by a polynomial gives a result within the same partition class as that
polynomial. To make this work, the order of these elements must be the same
for all $i \in \{1, \ldots, n\}$. Therefore, choose $t$ as the smallest integer so that

$$\mathrm{ord}(\phi_{j(i),i}) = t \wedge v_{P_i}(\phi_{j(i),i}) \ge s, \quad i = 1, \ldots, n$$

Now construct a partition of $V$ doing the following:

Let

$$A := \mathrm{ord}(V) \setminus \{i \in \mathrm{ord}(V) \mid \exists f \in \mathrm{ord}^{-1}(\{i\})\ \exists h \in \mathrm{ord}^{-1}(\{t\}) : h \mid f\}$$

Furthermore, let $h \in \mathrm{ord}^{-1}(\{t\})$ and define

$$T := \bigcup_{i=0}^{\infty} \mathrm{ord}^{-1}(\mathrm{ord}(h^i))$$

Notice that this definition of $T$ does not depend on the choice of $h$.

Let $A = \{a_1, \ldots, a_{|A|}\}$ and let

$$G_j := \mathrm{ord}^{-1}(\mathrm{ord}(\{f_j g \mid f_j \in \mathrm{ord}^{-1}(\{a_j\}) \wedge g \in T\}))$$

Now $G_1, \ldots, G_{|A|}$ is a partition of $V$, and furthermore, if a polynomial in some
$G_j$ is multiplied by a polynomial in $\mathrm{ord}^{-1}(\{t\})$ then the result will remain in
$G_j$.

Finally, the following notation is needed: Let $f \in R[z]$ be written as $f =
\sum_{j,k} f_{j,k} \phi_{k,i} z^j$, then

$$\mathrm{coef}(f, \phi_{k,i} z^j) := f_{j,k}$$

Now the algorithm can be stated:

Algorithm 3. Input: Pairs, $(P_1, w_1), \ldots, (P_n, w_n)$, and required zero multiplicity,
$s$.

Initialize by setting

$$G := \{g_1, \ldots, g_{|A|}\}$$

so that $\mathrm{ord}(G) = A$ (which means $\mathrm{ord}(g_j) = \min \mathrm{ord}(G_j)$ for $j = 1, \ldots, |A|$).
For $i = 1, \ldots, n$ do the following:

For each pair $(j, k) \in \mathbb{N}^2$ with $j + k \le s$ and $k \ge 1$ do the following:

Let $G^* := \{g \in G \mid \mathrm{coef}(g(z + w_i), \phi_{k,i} z^j) \ne 0\}$.

If $G^* \ne \emptyset$ then choose $f \in G^*$ so that $\mathrm{ord}(f) = \min \mathrm{ord}(G^*)$ and let

$$G := (G \setminus G^*) \cup \{\mathrm{coef}(f(z + w_i), \phi_{k,i} z^j)\, g - \mathrm{coef}(g(z + w_i), \phi_{k,i} z^j)\, f \mid g \in G^* \setminus \{f\}\} \cup \{\phi_{j(i),i}\, f\}$$

After this, the result is given by the polynomial in $G$ which is smallest with
respect to ord.

The correctness of this algorithm is proved by induction over the iteration
steps since $G$ at any time holds one polynomial from each partition class satisfying
the zero-conditions considered so far and minimal with respect to ord.

7 Factorization

The Q-polynomial is a polynomial in R[z] and therefore, factorization is not so 
easy. However, fairly simple and efficient methods exist for factoring univariate 
polynomials over a finite field (see for example [4] , Chapter 4) . In this section the 
problem of factoring the Q-polynomial is transformed into a problem of factoring 
a univariate polynomial over a (large) finite field. 

In the case of Hermitian codes, $R = \mathbb{F}_{q_1^2}[X, Y]/(X^{q_1+1} - Y^{q_1} - Y)$ and
$R = \mathrm{span}\{x^\alpha y^\beta \mid 0 \le \alpha \le q_1 \wedge 0 \le \beta\}$. So seen this way, polynomials in
$\mathrm{span}\{Q_0, \ldots, Q_\ell\}$ have degree in $x$ smaller than $q_1 + 1$ and degree in $y$ smaller
than some integer, $c$. Now let $f(Y) \in \mathbb{F}_{q_1^2}[Y]$ with $\deg(f) \ge c$ and

$$D_1 = \mathbb{F}_{q_1^2}[Y]/(f(Y))$$

Furthermore, let $g(X) := X^{q_1+1} - Y^{q_1} - Y \bmod f$ and

$$D_2 = D_1[X]/(g(X))$$

Consider the mappings, $\phi : \mathbb{F}_{q_1^2}[X, Y][z] \to D_1[X][z]$ and $\theta : D_1[X][z] \to D_2[z]$,
given by

$$\phi(h_1) = h_1 \bmod f$$

$$\theta(h_2) = h_2 \bmod g$$

It is a well-known fact that these mappings are homomorphisms and that the
composition of these mappings, $\theta\phi$, is again a homomorphism. So suppose that
$h \in \mathcal{L}(mP_\infty)$ and that $z - h \mid Q$; then $\theta\phi(z - h) \mid \theta\phi(Q)$ and furthermore,
reducing $z - h$ modulo $f$ and $g$ will not change $z - h$ since the degree in $Y$ of $f$
and the degree in $X$ of $g$ are higher than the degrees of $h$ in $Y$ and $X$ respectively.
Therefore, in the factorization step, it will be sufficient to factorize $\theta\phi(Q)$, since
this will still reveal the factors corresponding to codewords within distance $\tau_s$
from the received word.

This will be very useful if $f(Y)$ is chosen so that $D_2$ is a finite field. That
will be the case if and only if $D_1$ is a finite field ($f$ is irreducible over $\mathbb{F}_{q_1^2}$) and
$g(X)$ is irreducible over $D_1$.

Suppose that $f$ is irreducible so that $D_1 \simeq \mathbb{F}_{q_1^{2c_1}}$, where $c_1 = \deg(f)$. Notice
that $g(X) = X^{q_1+1} - (y^{q_1} + y)$ then is a binomial in $\mathbb{F}_{q_1^{2c_1}}[X]$. The question is
now whether $f$ can be chosen with degree at least $c$ so that $g$ is irreducible. That this
is in fact the case is shown in the following.

The following theorem, which is a special case of Theorem 3.75 of [4], is needed:

Theorem 4. Let $\omega \in \mathbb{F}_{q_1^{2c}} \setminus \{0\}$ and let $e$ denote the order of $\omega$. Then $X^{q_1+1} - \omega$
is irreducible over $\mathbb{F}_{q_1^{2c}}$ if and only if each prime factor of $q_1 + 1$ divides $e$ but
not $\frac{q_1^{2c} - 1}{e}$.

Notice that since $q_1 + 1$ divides $q_1^{2c} - 1$ the theorem states that $X^{q_1+1} - \omega$ is
irreducible if and only if $\gcd(q_1 + 1, \frac{q_1^{2c} - 1}{e}) = 1$ with $e$ being as in the theorem.
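
The gcd criterion is easy to evaluate once the order $e$ is known. The sketch below assumes the criterion reads $\gcd(q_1 + 1, (q_1^{2c} - 1)/e) = 1$ and that $e$ has already been computed elsewhere; the function name is hypothetical.

```python
from math import gcd


def binomial_is_irreducible(q1, c, e):
    """Evaluate gcd(q1 + 1, (q1**(2c) - 1) / e) == 1 for an element of order e."""
    units = q1 ** (2 * c) - 1  # size of the multiplicative group of the field
    if e <= 0 or units % e:
        raise ValueError("e must divide q1**(2c) - 1")
    return gcd(q1 + 1, units // e) == 1


# Over GF(4) (q1 = 2, c = 1): a primitive element has order 3, the identity has order 1.
print(binomial_is_irreducible(2, 1, 3), binomial_is_irreducible(2, 1, 1))
```

In the first case $X^3 - \omega$ has no root because $\omega$ is not a cube in GF(4); in the second case $X^3 - 1$ has the root 1.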

The existence of a polynomial as the one called / above is given by the 
following theorem (the proof is omitted here): 

Theorem 5. Let $c \ge 1$ be an integer. Then there exists a polynomial, $f(Y)$, so
that $\deg(f) \ge c$ and the order, $e$, of $y^{q_1} + y$ in $\mathbb{F}_{q_1^2}[Y]/(f(Y))$ satisfies

$$\gcd\left(q_1 + 1, \frac{q_1^{2\deg(f)} - 1}{e}\right) = 1$$

It should be mentioned that experiments indicate that irreducible polynomi- 
als with the property described in Theorem 5 are rather common, so in practice 
it seems to be sufficient to generate random irreducible polynomials and check 
if they have the right property. However, we have no proof that this will always 
work. 

8 Conclusion 

We have demonstrated how to decode Hermitian codes efficiently beyond half the 
minimum distance using Sudan’s algorithm. The main steps are calculation of 
increasing zero bases, fast interpolation in order to determine the Q-polynomial, 
and a fast method of factorization. The actual complexity of the overall algorithm 
remains to be calculated and the extension to more general algebraic geometry 
codes is a subject for future work. 
Abstract. Under the assumption that we have defining equations of an
affine algebraic curve in special position with respect to a rational place
$Q$, we propose an algorithm computing a basis of $\mathcal{L}(D)$ of a divisor $D$
from an ideal basis of the ideal $\mathcal{L}(D + \infty Q)$ of the affine coordinate ring
$\mathcal{L}(\infty Q)$ of the given algebraic curve, where $\mathcal{L}(D + \infty Q) := \bigcup_{i=1}^{\infty} \mathcal{L}(D + iQ)$.
Elements in the basis produced by our algorithm have pairwise
distinct discrete valuations at $Q$, which is crucial in the construction of
algebraic geometry codes. Our method is applicable to a curve embedded
in an affine space of arbitrary dimension, and involves only Gaussian
elimination and the division of polynomials by the Gröbner basis of the
ideal defining the curve.



1 Introduction 

For a divisor $D$ on an algebraic curve, there exists the associated linear space
$\mathcal{L}(D)$. Recently we showed how to apply the Feng-Rao bound and decoding
algorithm [8] for the $\Omega$-construction of algebraic geometry codes to the
$\mathcal{L}$-construction, and showed examples in which the $\mathcal{L}$-construction gives better
linear codes than the $\Omega$-construction in a certain range of parameters on the same
curve [15]. In order to apply the Feng-Rao algorithm to an algebraic geometry
code from the $\mathcal{L}$-construction, we have to find a basis of the differential space
$\Omega(-D + mQ)$ whose elements have pairwise distinct discrete valuations at the
place $Q$. Finding such a basis of $\Omega(-D + mQ)$ reduces to the problem of finding
a basis of $\mathcal{L}(D')$ whose elements have pairwise distinct discrete valuations at $Q$
as described in [16]. Finding such a basis of $\Omega(-D + mQ)$ is also necessary in
the precomputation of the efficient encoding method proposed in [16]. But there
seems to be no algorithm capable of finding such a basis of $\mathcal{L}(D')$. In this paper we
present an algorithm computing such a basis.

* 2000 Mathematical Subject Classification. Primary 14Q05, 13P10; Secondary 94B27, 
11T71, 14H05, 14C20. 
An affine algebraic curve with one rational place $Q$ at infinity is easy to
handle and is used in many publications [9,17,18,19,20,21,22,23]. For a divisor $D$
we define $\mathcal{L}(D + \infty Q) := \bigcup_{i=1}^{\infty} \mathcal{L}(D + iQ)$. An affine algebraic curve is said to be in
special position with respect to a place $Q$ of degree one if its affine coordinate ring
is $\mathcal{L}(\infty Q)$ and the pole orders of coordinate variables generate the Weierstrass
semigroup at $Q$ (Definition 2). Under the assumption that we have defining
equations of an affine algebraic curve in special position with respect to $Q$, we
point out that a divisor can be represented as an ideal of $\mathcal{L}(\infty Q)$, and we propose
an efficient algorithm computing a basis of $\mathcal{L}(D)$.

For effective divisors $A$ and $B$ with $\mathrm{supp} A \cap \mathrm{supp} B = \emptyset$ and $Q \notin \mathrm{supp} A \cup
\mathrm{supp} B$, there is a close relation between the linear space $\mathcal{L}(A - B + nQ)$ and
the fractional ideal $\mathcal{L}(A - B + \infty Q)$ of $\mathcal{L}(\infty Q)$, namely

$$\mathcal{L}(A - B + nQ) = \{f \in \mathcal{L}(A - B + \infty Q) \mid v_Q(f) \ge -n\},$$

where $v_Q$ denotes the discrete valuation at $Q$. When $A = 0$, by this relation we
can compute a basis of $\mathcal{L}(-B + nQ)$ from a generating set of $\mathcal{L}(-B + \infty Q)$ as
an ideal of $\mathcal{L}(\infty Q)$ under a mild assumption.

When $A > 0$, we find an effective divisor $E$ such that $-E + n'Q$ is linearly
equivalent to $A - B + nQ$, then find a basis of $\mathcal{L}(-E + n'Q)$ from a generating
set of the ideal $\mathcal{L}(-E + \infty Q)$, then find a basis of $\mathcal{L}(A - B + nQ)$ from
that of $\mathcal{L}(-E + n'Q)$ using the linear equivalence. Computing an ideal basis
of $\mathcal{L}(-E + \infty Q)$ from $A - B + nQ$ involves computation of ideal quotients in
the Dedekind domain $\mathcal{L}(\infty Q)$, but by clever use of the properties of an affine
algebraic curve in special position, our method involves only Gaussian elimination
and a small number of divisions of polynomials by the Gröbner basis of the
ideal defining the curve, thus it is efficient. Moreover, while the other algorithms
[2,3,4,6,11,12,13,25,26] except [10] are applicable only to a plane algebraic curve,
our method is applicable to a curve embedded in an affine space of arbitrary
dimension. The algorithm [10] is designed for an arbitrary projective nonsingular
variety whose homogeneous coordinate ring satisfies Serre’s normality criterion
$S_2$ (a definition of $S_2$ can be found in [7, Theorem 11.5]), and due to the wide
applicability their method involves Buchberger’s algorithm that sometimes takes
very long computation time.

Due to the page limitation we had to omit all examples and most of the proofs.
For the complete version, please wait for the journal paper version or send email 
to the first author. 

2 Theoretical Basis for Computation 

First we fix notations used in this paper. $K$ denotes an arbitrary perfect field.
We consider an algebraic function field $F/K$ of one variable. $\mathbb{P}_F$ denotes the set
of places in $F/K$. For a place $P$, $O_P$ (resp. $v_P$) denotes the discrete valuation
ring (resp. discrete valuation) corresponding to $P$. Other notations follow those
in Stichtenoth’s textbook [24] unless otherwise specified.

In this section, we introduce theorems which play important roles in ideal
computation in the affine coordinate ring of an affine algebraic curve and
computation of a basis of $\mathcal{L}(D)$. Hereafter we fix a place $Q$ of degree one in $F/K$.

2.1 Relation between Fractional Ideals of Nonsingular Affine
Coordinate Ring and Divisors in Function Field

Definition 1. For a divisor $D$ in $F/K$, we define $\mathcal{L}(D + \infty Q) := \bigcup_{i=1}^{\infty} \mathcal{L}(D + iQ)$.

Then we have $\mathcal{L}(\infty Q) = \bigcap_{P \in \mathbb{P}_F \setminus \{Q\}} O_P$, and we can see that $\mathcal{L}(\infty Q)$ is a
Dedekind domain by [24, p.71]. By [24, Proposition III.2.9], the set of maximal
ideals in $\mathcal{L}(\infty Q)$ is $\{\mathcal{L}(-P + \infty Q) \mid P \in \mathbb{P}_F \setminus \{Q\}\}$.

Proposition 1. For a divisor $D$ in $F/K$ with $Q \notin \mathrm{supp}(D)$, $\mathcal{L}(-D + \infty Q)$ is
a fractional ideal in $\mathcal{L}(\infty Q)$. We have

$$\mathcal{L}(-D + \infty Q) = \prod_{P \in \mathbb{P}_F} \mathcal{L}(-P + \infty Q)^{v_P(D)}. \qquad \Box$$



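Corollary 1. For two divisors $D, E$ with support disjoint from $Q$,

$$\mathcal{L}(-D + \infty Q)\,\mathcal{L}(-E + \infty Q) = \mathcal{L}(-(D + E) + \infty Q),$$

$$\mathcal{L}(-D + \infty Q) + \mathcal{L}(-E + \infty Q) = \mathcal{L}\Big(-\sum_{P \ne Q} \min\{v_P(D), v_P(E)\}P + \infty Q\Big),$$

$$\mathcal{L}(-D + \infty Q) \cap \mathcal{L}(-E + \infty Q) = \mathcal{L}\Big(-\sum_{P \ne Q} \max\{v_P(D), v_P(E)\}P + \infty Q\Big),$$

$$\mathcal{L}(-D + \infty Q) : \mathcal{L}(-E + \infty Q) = \mathcal{L}\Big(-\sum_{P \ne Q} \max\{0, v_P(D) - v_P(E)\}P + \infty Q\Big).$$

Proof. The assertion follows from Proposition 1 and the unique factorization of
an ideal in a Dedekind domain [27, Theorem 11, Section 5.6]. □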

By the facts described so far, we can show a preliminary version of our method
for a basis of $\mathcal{L}(D)$. Let

$$D := A - B + nQ,$$

where $A$ and $B$ are effective, $\mathrm{supp} A \cap \mathrm{supp} B = \emptyset$ and $Q \notin \mathrm{supp} A \cup \mathrm{supp} B$.
Suppose that generating sets of the ideals $\mathcal{L}(-A + \infty Q)$ and $\mathcal{L}(-B + \infty Q)$ are
given. If $A = 0$, then

$$\mathcal{L}(D) = \{x \in \mathcal{L}(-B + \infty Q) \mid v_Q(x) \ge -n\}.$$

From the equation above, if we have a basis of $\mathcal{L}(-B + \infty Q)$ as a $K$-linear space
with pairwise distinct pole orders at $Q$, then finding a basis of $\mathcal{L}(D)$ from that
of $\mathcal{L}(-B + \infty Q)$ is just picking up elements in the basis of $\mathcal{L}(-B + \infty Q)$ with
pole orders $\le n$. We shall show how to compute such a basis of $\mathcal{L}(-B + \infty Q)$
from a generating set of the ideal $\mathcal{L}(-B + \infty Q)$ in Theorem 1.

If $A \ne 0$, then choose a nonzero element $f \in \mathcal{L}(-A + \infty Q)$. Let $(f)$ be the
ideal generated by $f$ in $\mathcal{L}(\infty Q)$, and

$$I = ((f)\,\mathcal{L}(-B + \infty Q)) : \mathcal{L}(-A + \infty Q).$$

Then

$$I = \mathcal{L}(-(f) + \infty Q)\,\mathcal{L}(-B + \infty Q) : \mathcal{L}(-A + \infty Q) = \mathcal{L}(A - B - (f) + \infty Q).$$

Since $\mathcal{L}(A - B - (f) + \infty Q)$ is an ordinary ideal of $\mathcal{L}(\infty Q)$, we can compute a
basis of $\mathcal{L}(A - B + nQ - (f))$, say $b_1, \ldots, b_l$. Then $b_1/f, \ldots, b_l/f$ is a basis of
$\mathcal{L}(D) = \mathcal{L}(A - B + nQ)$. We have to compute an ideal quotient in the method
above. We shall show how to compute an ideal quotient only with linear algebra
techniques in Section 4.



2.2 Modules over $\mathcal{L}(\infty Q)$

In this subsection we shall study how we can represent an $\mathcal{L}(\infty Q)$-submodule of
$F$.

Proposition 2. For a $K$-subspace $W$ of $\mathcal{L}(\infty Q)$, suppose that there is a subset
$\{\alpha_j\}_{j \in v_Q(W \setminus \{0\})} \subset W$ such that $v_Q(\alpha_j) = j$. Then $\{\alpha_j\}_{j \in v_Q(W \setminus \{0\})}$ is a $K$-basis
of $W$. □

Let $a \in -v_Q(\mathcal{L}(\infty Q) \setminus K)$. Then $a > 0$. Fix an element $x \in F$ such that
$(x)_\infty = aQ$.

Proposition 3. For an $\mathcal{L}(\infty Q)$-submodule $M$ of $\mathcal{L}(\infty Q)$, we set $b_i := \min\{j \in
-v_Q(M \setminus \{0\}) \mid j \bmod a = i\}$ for $i = 0, \ldots, a - 1$. Choose elements $y_i \in M$ such
that $v_Q(y_i) = -b_i$. Then $\{y_0, y_1, \ldots, y_{a-1}\}$ generates $M$ as a $K[x]$-module. □

Proposition 4. We retain notations from the previous proposition. If a $K$-subspace
$W$ generates $M$ as a $K[x]$-module, that is, $M = K[x]W$, then we can
find the elements $y_i$ in $W$ for $i = 0, \ldots, a - 1$. □
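
The integers $b_i$ of Proposition 3 depend only on the set of pole orders of the module. The sketch below computes them from a finite list of pole orders; the list used in the example (pole orders generated by 4 and 5, truncated at 16) and the function name are assumptions made only for illustration.

```python
def minimal_pole_orders(pole_orders, a):
    """For each residue i mod a, return the smallest pole order j with j % a == i.

    Residues that do not occur in `pole_orders` are reported as None.
    """
    best = {}
    for j in pole_orders:
        i = j % a
        if i not in best or j < best[i]:
            best[i] = j
    return [best.get(i) for i in range(a)]


# Pole orders generated by 4 and 5, truncated at 16, with a = 4
orders = [0, 4, 5, 8, 9, 10, 12, 13, 14, 15, 16]
print(minimal_pole_orders(orders, 4))
```

Every other pole order in a residue class is obtained from its minimal one by adding multiples of $a$, which corresponds to multiplying the chosen generator by powers of $x$.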

3 Gröbner Bases of an Affine Algebraic Curve with a
Unique Rational Place at Infinity

An affine algebraic curve with a unique rational place at infinity is convenient
and has been treated by several authors independently [9,17,18,19,20,21,22,23].
We review and extend results in [19,20,23].

Definition 2. [23, Definition 11] Let an ideal $I \subset K[X_1, \ldots, X_t]$ define an
affine algebraic curve, $R := K[X_1, \ldots, X_t]/I$, $F$ the quotient field of $R$, and $Q$ a
place of degree one in $F/K$. Then the affine algebraic curve defined by $I$ is said
to be in special position with respect to $Q$ if the following conditions are met:

1. The pole divisor of $X_i \bmod I$ is a multiple of $Q$ for each $i$.

2. The pole orders of $X_1 \bmod I, X_2 \bmod I, \ldots, X_t \bmod I$ at $Q$ generate the
Weierstrass semigroup $\{i \mid \mathcal{L}(iQ) \ne \mathcal{L}((i - 1)Q)\}$ at $Q$. In other words, for
any $j \in \{i \mid \mathcal{L}(iQ) \ne \mathcal{L}((i - 1)Q)\}$ there exist nonnegative integers $l_1, \ldots, l_t$
such that

$$j = \sum_{i=1}^{t} -l_i v_Q(X_i \bmod I).$$

The Weierstrass form of elliptic curves can be considered as a special case of
curves in special position.

Proposition 5. We retain notations from Definition 2. $R = \mathcal{L}(\infty Q)$ and the
affine algebraic curve defined by $I$ is nonsingular. □

If an algebraic curve is not in special position, then the proposed method
cannot be applied to it. We can put an arbitrary algebraic curve into special
position using Gröbner bases if we know elements in the function field which
have their unique pole at some place $Q$ of degree one and whose pole orders
generate the Weierstrass semigroup $-v_Q(\mathcal{L}(\infty Q) \setminus \{0\})$ [23, p.1739].

In another direction, it is convenient to have a class of algebraic curves known
to be in special position. Miura found the necessary and sufficient condition for
a nonsingular nonrational affine plane curve to be in special position [19,20].
An affine algebraic set defined by $F(X, Y) = 0$ is a nonsingular nonrational
affine algebraic curve in special position with respect to $Q$ if and only if it is
nonsingular and

$$F(X, Y) = a_{b,0} X^b + a_{0,a} Y^a + \sum_{ai + bj < ab} a_{i,j} X^i Y^j,$$

where $a_{i,j} \in K$, both $a_{b,0}$ and $a_{0,a}$ are nonzero and $a$ and $b$ are relatively
prime positive integers¹. In the above situation $v_Q(X \bmod F(X, Y)) = -a$ and
$v_Q(Y \bmod F(X, Y)) = -b$. Then he generalized this result to curves in affine
space of arbitrary dimension [19,20].
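
The shape condition on $F(X, Y)$ can be tested mechanically. The sketch below assumes the condition reads $F = a_{b,0}X^b + a_{0,a}Y^a + \sum_{ai+bj<ab} a_{i,j}X^iY^j$ with $\gcd(a, b) = 1$; it checks only this shape and does not test nonsingularity. The dictionary representation and the function name are hypothetical.

```python
from math import gcd


def has_miura_shape(terms, a, b):
    """Check the exponent pattern of F(X, Y) for given pole orders a and b.

    `terms` maps exponent pairs (i, j) of X**i * Y**j to non-zero coefficients.
    Nonsingularity of the curve is NOT checked here.
    """
    if a <= 0 or b <= 0 or gcd(a, b) != 1:
        return False
    if not terms.get((b, 0)) or not terms.get((0, a)):
        return False
    leading = {(b, 0), (0, a)}
    return all(
        a * i + b * j < a * b
        for (i, j), coeff in terms.items()
        if coeff and (i, j) not in leading
    )


# X**3 + Y**2 + Y over a field of characteristic 2 (so signs do not matter), a = 2, b = 3
print(has_miura_shape({(3, 0): 1, (0, 2): 1, (0, 1): 1}, 2, 3))
```

The example is the Hermitian curve for $q_1 = 2$, where $X$ has pole order 2 and $Y$ has pole order 3.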

We use the theory of Grobner bases. Basic facts in the theory are explained in 
[5]. Hereafter I C K[Xi, . . . , Xt] denotes an ideal defining an algebraic curve in 
special position with respect to a place Q of degree one of the function field F of 
the curve, unless otherwise stated. We fix a monomial order ^ on K[Xi , . . . , Xt] 
induced by the discrete valuation at Q. IMq denotes the set of nonnegative inte- 
gers. 

^ Although all published proofs of that fact are in Japanese, an English proof can be 
found in [14]. 
Definition 3. We define $X_1^{m_1} \cdots X_t^{m_t} \prec X_1^{n_1} \cdots X_t^{n_t}$ if

$$-v_Q(X_1^{m_1} \cdots X_t^{m_t} \bmod I) < -v_Q(X_1^{n_1} \cdots X_t^{n_t} \bmod I),$$

or

$$-v_Q(X_1^{m_1} \cdots X_t^{m_t} \bmod I) = -v_Q(X_1^{n_1} \cdots X_t^{n_t} \bmod I)$$

and $(m_1, \ldots, m_t) <_T (n_1, \ldots, n_t)$ with some total order $<_T$ on $\mathbb{N}_0^t$. The order $<_T$ satisfies
the following conditions:

1. $(m_1, \ldots, m_t) <_T (n_1, \ldots, n_t)$ whenever $m_1 > n_1$.

2. If $(m_1, \ldots, m_t) <_T (n_1, \ldots, n_t)$, then $(m_1, \ldots, m_t) + (l_1, \ldots, l_t) <_T
(n_1, \ldots, n_t) + (l_1, \ldots, l_t)$ for all $(l_1, \ldots, l_t) \in \mathbb{N}_0^t$.
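The order of Definition 3 can be sketched in a few lines. This is only an illustration under stated assumptions: the list `weights` is assumed to hold the pole orders $-v_Q(X_i \bmod I)$, and `tie_break_key` is one hypothetical choice of $<_T$ that satisfies conditions 1 and 2 (a larger first exponent makes the tuple smaller, the remaining exponents are compared lexicographically). The function names are not taken from the paper.

```python
def pole_order(exponents, weights):
    """Pole order -v_Q(X^N mod I) of a monomial X^N.

    Assumes weights[i] is the pole order of the (i+1)-th variable at Q;
    valuations add up under multiplication.
    """
    return sum(n * a for n, a in zip(exponents, weights))


def tie_break_key(exponents):
    # Hypothetical choice of the total order <_T: a larger first exponent
    # gives a smaller tuple (condition 1), and comparing keys
    # lexicographically is invariant under translation (condition 2).
    return (-exponents[0],) + tuple(exponents[1:])


def precedes(m, n, weights):
    """True if X^m strictly precedes X^n in the order of Definition 3."""
    pm, pn = pole_order(m, weights), pole_order(n, weights)
    if pm != pn:
        return pm < pn
    return tie_break_key(m) < tie_break_key(n)


def sort_monomials(monomials, weights):
    """Sort exponent tuples in increasing order with respect to the monomial order."""
    return sorted(monomials, key=lambda e: (pole_order(e, weights), tie_break_key(e)))
```

For a plane curve with pole orders 2 and 3 for $X$ and $Y$, the monomials $X^3$ and $Y^2$ share pole order 6, and condition 1 forces $X^3 \prec Y^2$, which is what `precedes((3, 0), (0, 2), (2, 3))` reports.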

Definition 4. LM denotes the leading monomial of a polynomial with respect
to a monomial order. Let $J \subset K[X_1, \ldots, X_t]$ be a nonzero ideal. The delta set
$\Delta(J)$ of $J$ with respect to the monomial order is

$$\Delta(J) := \{(n_1, \ldots, n_t) \in \mathbb{N}_0^t \mid X_1^{n_1} \cdots X_t^{n_t} \notin \mathrm{LM}(J)\}.$$

If $N = (n_1, \ldots, n_t)$, $X^N$ denotes $X_1^{n_1} \cdots X_t^{n_t}$. For the delta set of an ideal
$J$, the following is known.

Proposition 6. [5, p.229] $\{X^N \bmod J \mid N \in \Delta(J)\}$ forms a $K$-basis of
$K[X_1, \ldots, X_t]/J$ as a $K$-vector space.

The delta set of the defining ideal $I$ of an algebraic curve in special position
with respect to a place $Q$ has nice properties. For simplicity hereafter we assume
that $v_Q(X_i \bmod I) \neq 0$ for each $i$.

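Definition 5.

$$B(\prec) := \{N \in \mathbb{N}_0^t \mid \text{if } L \in \mathbb{N}_0^t \text{ and } v_Q(X^L \bmod I) = v_Q(X^N \bmod I), \text{ then } X^N \preceq X^L\}.$$

For each $0 \le i \le -v_Q(X_1 \bmod I) - 1$,

$$b_i := \min\{j \in -v_Q(\mathcal{L}(\infty Q) \setminus \{0\}) \mid j \bmod -v_Q(X_1 \bmod I) = i\},$$

$$T(\prec) := \{N \in B(\prec) \mid \exists i,\ -v_Q(X^N \bmod I) = b_i\}.$$

Note that if $N, L \in B(\prec)$ and $N \neq L$, then $v_Q(X^N \bmod I) \neq v_Q(X^L \bmod I)$.
This implies $\#T(\prec) = -v_Q(X_1 \bmod I)$.

The next proposition is a generalization of [19, Lemma 5.9 (2) and Lemma
5.13 (1)].

Proposition 7.

$$\Delta(I) = B(\prec),$$

$$B(\prec) = \{L + (n, 0, \ldots, 0) \mid L \in T(\prec),\ n \in \mathbb{N}_0\}. \quad □$$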
Let NF($I$) be the set of polynomials $F \in K[X_1, \ldots, X_t]$ such that the
remainder on division of $F$ by a Gröbner basis of $I$ is $F$ itself. An element
$f \in \mathcal{L}(\infty Q)$ is represented in a computer by a polynomial $F \in$ NF($I$) such that
$F \bmod I = f$. By Propositions 6 and 7, $\{X^N \mid N \in B(\prec)\}$ is a $K$-basis of NF($I$).
If $X_1^{n_1} X_2^{n_2} \cdots X_t^{n_t}$ is the leading monomial of $F \in$ NF($I$), then

$$v_Q(F \bmod I) = -a_1 n_1 - \cdots - a_t n_t,$$

where $a_i = -v_Q(X_i \bmod I)$, because the lower terms of $F$ with respect to the
monomial order $\prec$ have higher discrete valuations at $Q$ by definition of $B(\prec)$.
This easy method for discrete valuation computation is crucial in Theorem 1.
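As a rough illustration of this valuation rule, the sketch below reads off $v_Q(F \bmod I)$ from the leading monomial of a polynomial stored as a dictionary from exponent tuples to coefficients. The representation, the function names, and the tie-break (the same hypothetical $<_T$ as in any order satisfying condition 1 of Definition 3) are assumptions made for the example, and the input is assumed to be already in NF($I$).

```python
def monomial_key(exponents, weights):
    """Sort key realizing the monomial order: pole order first, then a tie-break."""
    pole = sum(n * a for n, a in zip(exponents, weights))
    # Hypothetical tie-break consistent with condition 1 of Definition 3.
    return (pole, (-exponents[0],) + tuple(exponents[1:]))


def leading_exponent(poly, weights):
    """Exponent tuple of the leading monomial of poly (dict: exponents -> coefficient)."""
    nonzero = [e for e, c in poly.items() if c != 0]
    return max(nonzero, key=lambda e: monomial_key(e, weights))


def valuation(poly, weights):
    """v_Q(F mod I) for F in NF(I): minus the weighted degree of the leading monomial."""
    lead = leading_exponent(poly, weights)
    return -sum(n * a for n, a in zip(lead, weights))


def is_normal_form_plane(poly, a):
    """For a plane curve of Miura's form with LM(F) = Y^a, normal forms have Y-degree < a."""
    return all(e[1] < a for e, c in poly.items() if c != 0)
```

For example, with pole orders 2 and 3 for $X$ and $Y$, the polynomial $X^2Y + X^3 + 1$ has leading monomial $X^2Y$ of pole order 7, so its valuation at $Q$ is $-7$. No Gröbner basis computation is needed once the polynomial is in normal form.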



4 Fast Computation of Ideal Quotients 

In this section we show how we can efficiently compute various ideal operations
in $\mathcal{L}(\infty Q)$. We retain notations from the previous section and define
$a := -v_Q(X_1 \bmod I)$. To make computation most efficient, we have to make
$a$ ($\neq 0$) as small as possible.

4.1 Representation of Ideals 

Definition 6. For a nonzero ideal $J \subset \mathcal{L}(\infty Q)$, we call $G_0, \ldots, G_{a-1} \in$ NF($I$)
a standard basis for $J$ if:

1. $G_0 \bmod I, \ldots, G_{a-1} \bmod I$ belong to $J$.

2. $-v_Q(G_i \bmod I) = \min\{j \in -v_Q(J \setminus \{0\}) \mid j \bmod a = i\}$.

Note that by definition $G_i \bmod I \neq 0$ for $i = 0, \ldots, a - 1$.

This representation is convenient for computation of a basis of $\mathcal{L}(D)$.

Theorem 1. Suppose that $B$ is an effective divisor with $v_Q(B) = 0$ and
$G_0, \ldots, G_{a-1}$ is a standard basis for $\mathcal{L}(-B + \infty Q)$. Then a basis of $\mathcal{L}(-B + nQ)$
is

$$\{X_1^i G_j \bmod I \mid v_Q(X_1^i G_j \bmod I) \ge -n\}.$$

Proof. The assertion follows from Proposition 2. □
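Theorem 1 reduces the computation of a basis to bookkeeping on pole orders. The following sketch assumes only that the pole orders $-v_Q(G_j \bmod I)$ of a standard basis are known; the function name and the data layout are hypothetical.

```python
def basis_exponents(pole_orders, n):
    """List the pairs (i, j) such that X_1^i * G_j has pole order at most n.

    pole_orders[j] is assumed to be -v_Q(G_j mod I) for a standard basis
    G_0, ..., G_{a-1}, so pole_orders[j] mod a == j and a == len(pole_orders).
    Multiplying by X_1 raises the pole order by a.
    """
    a = len(pole_orders)
    assert all(g % a == j for j, g in enumerate(pole_orders))
    pairs = []
    for j, g in enumerate(pole_orders):
        i = 0
        while i * a + g <= n:
            pairs.append((i, j))
            i += 1
    # Return the basis elements sorted by increasing pole order.
    return sorted(pairs, key=lambda ij: ij[0] * a + pole_orders[ij[1]])
```

As a small check, take $a = 2$ and the standard basis $\{1, Y\}$ of $\mathcal{L}(\infty Q)$ on a curve where $X$ and $Y$ have pole orders 2 and 3 (so $B = 0$). For $n = 5$ the function returns five pairs, corresponding to $1, X, Y, X^2, XY$ with pole orders 0, 2, 3, 4, 5.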

We describe how to compute $G_0, \ldots, G_{a-1}$ from given $F_1, \ldots, F_s \in
K[X_1, \ldots, X_t]$ where $F_1 \bmod I, \ldots, F_s \bmod I$ generate $J$. For simplicity we
assume that none of the $F_i \bmod I$ is zero.

Let $T_0, \ldots, T_{a-1} \in T(\prec)$ satisfy

$$-v_Q(X^{T_i} \bmod I) = \min\{j \in -v_Q(\mathcal{L}(\infty Q) \setminus \{0\}) \mid j \bmod a = i\}.$$

Then $\{X^{T_i} F_j \bmod I \mid 0 \le i \le a - 1,\ 1 \le j \le s\}$ generates $J$ as a $K[X_1 \bmod I]$-module
since $\{X^{T_i} \bmod I\}$ generates $\mathcal{L}(\infty Q)$ as a $K[X_1 \bmod I]$-module by
Proposition 7. Let $\{H_l\}$ be the set of remainders on division of $X^{T_i} F_j$ by a
Gröbner basis of $I$ for $0 \le i \le a - 1$ and $1 \le j \le s$. Then
$H_1, \ldots, H_{sa}$ generate $J$ as a $K[X_1 \bmod I]$-module and by
Proposition 4, we can find $G_0, \ldots, G_{a-1}$ from the $K$-vector space generated by
$H_1, \ldots, H_{sa}$. We compute $G_0, \ldots, G_{a-1}$ by Gaussian elimination as follows.
Let $\{B_1, B_2, \ldots\} = \Delta(I)$ such that

$$v_Q(X^{B_i} \bmod I) > v_Q(X^{B_{i+1}} \bmod I), \qquad (1)$$

and define the integer $\mu$ by the equation

$$-v_Q(X^{B_\mu} \bmod I) = \max\{-v_Q(H_i \bmod I) \mid i = 1, \ldots, sa\}.$$

Write each polynomial $H_i$ as

$$H_i = m_{i1} X^{B_\mu} + m_{i2} X^{B_{\mu-1}} + \cdots + m_{i\mu} X^{B_1},$$

for each $i$. Note that $X^{B_1} = 1$. Consider the matrix $(m_{ij})$. By elementary row
operations, we can transform the matrix $(m_{ij})$ into a form such that for any two
nonzero rows the columns of their left-most nonzero elements are different. Let
$(n_{ij})$ be a transformed matrix of $(m_{ij})$, and

$$E_i = n_{i1} X^{B_\mu} + n_{i2} X^{B_{\mu-1}} + \cdots + n_{i\mu} X^{B_1}.$$

Since the leading monomials of $E_k$ and $E_l$ are different if $k \neq l$, $v_Q(E_k \bmod I) \neq
v_Q(E_l \bmod I)$.

Then $\{v_Q(E_i \bmod I) \mid 1 \le i \le sa\}$ equals $v_Q(\langle H_1 \bmod I, \ldots, H_{sa} \bmod I \rangle)$
where $\langle \cdot \rangle$ denotes the vector space generated by $\cdot$. Thus we can choose
$G_0, \ldots, G_{a-1}$ as $G_i = E_k$ where

$$-v_Q(E_k \bmod I) = \min\{j \in \{-v_Q(E_1 \bmod I), \ldots, -v_Q(E_{sa} \bmod I)\} \mid j \bmod a = i\}.$$

Since $(m_{ij})$ is a matrix with $sa$ rows and $\mu$ columns, the number of arithmetic operations in $K$
required to compute $n_{ij}$ from $m_{ij}$ is of the order $O(\max\{\mu, sa\}^3)$, and

$$\mu \le \max\{-v_Q(H_i \bmod I) \mid i = 1, \ldots, sa\}$$

$$= \max\{-v_Q(F_i \bmod I) \mid i = 1, \ldots, s\} + \max\{-v_Q(X^{T_i} \bmod I) \mid i = 0, \ldots, a - 1\}.$$
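The elimination step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the coefficient field is assumed to be the rationals (via `fractions.Fraction`) purely for the example, rows are coefficient vectors ordered from the largest monomial $X^{B_\mu}$ down to $X^{B_1} = 1$, and the helper names are hypothetical.

```python
from fractions import Fraction


def reduce_rows(matrix):
    """Row-reduce so that nonzero rows have pairwise distinct left-most nonzero columns."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(n_cols):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        sel = next((r for r in range(pivot_row, len(rows)) if rows[r][col] != 0), None)
        if sel is None:
            continue
        rows[pivot_row], rows[sel] = rows[sel], rows[pivot_row]
        piv = rows[pivot_row]
        # Clear this column in every row below the pivot row.
        for r in range(pivot_row + 1, len(rows)):
            if rows[r][col] != 0:
                factor = rows[r][col] / piv[col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], piv)]
        pivot_row += 1
    return rows


def leading_columns(rows):
    """Column index of the left-most nonzero entry of each nonzero row."""
    return [next(c for c, x in enumerate(row) if x != 0) for row in rows if any(row)]


def pick_by_residue(pole_orders, a):
    """For each residue class mod a, index of the row with the smallest pole order."""
    best = {}
    for k, pole in enumerate(pole_orders):
        r = pole % a
        if r not in best or pole < pole_orders[best[r]]:
            best[r] = k
    return best
```

After `reduce_rows`, each nonzero row has its own leading column, hence its own pole order, and `pick_by_residue` selects, for every residue $i$ modulo $a$, the row of minimal pole order, mirroring the choice of $G_i$ described above.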

These $G_0, \ldots, G_{a-1}$ have a nice property, which is convenient for computation
of an ideal quotient.

Proposition 8. Let $\mathcal{G}$ be a Gröbner basis for $I$. Then $\{G_0, \ldots, G_{a-1}\} \cup \mathcal{G}$ is a
Gröbner basis for $I + \langle G_0, \ldots, G_{a-1} \rangle$ where $\langle \cdot \rangle$ denotes the ideal generated by $\cdot$. □

If one does not need efficiency and wants to reduce the effort required to
implement our algorithm, there is an alternative approach to compute a standard
basis for $J$.

Proposition 9. Let $\mathcal{G}$ be a Gröbner basis for $I + \langle F_1, \ldots, F_s \rangle$. Then we can
find all elements in a standard basis for $J$ among $\{$the remainders on division
of $X^T H \mid T \in T(\prec),\ H \in \mathcal{G}\}$. □
4.2 Ideal Quotient

Suppose that a standard basis of an ideal $J_1 \subset \mathcal{L}(\infty Q)$ is $\{G_0, \ldots, G_{a-1}\}$,
and an ideal $J_2 \subset \mathcal{L}(\infty Q)$ is generated by $\{H_1 \bmod I, \ldots, H_b \bmod I\}$, where
$H_i \in K[X_1, \ldots, X_t]$ for each $i$. We would like to compute a standard basis of
$J_1 : J_2$.

Obviously $J_1 \subseteq J_1 : J_2$. Let $H_0, \ldots, H_{a-1}$ be a standard basis of $J_1 : J_2$. Each
$H_i$ is determined by the following algorithm. Each element in $B(\prec)$ is indexed
as in Eq. (1).

Algorithm 2. In this algorithm, variables are integers $\alpha$, $\gamma$, a polynomial
element-in-quotient $\in K[X_1, \ldots, X_t]$, and a polynomial candidate $\in
K(\beta_1, \ldots, \beta_{\alpha-1})[X_1, \ldots, X_t]$, where $\beta_j$ is an indeterminate over $K$ for each $j$.

1. Let element-in-quotient $= G_i$. Find an integer $\alpha$ such that $B_\alpha \in B(\prec)$
and $X_1 X^{B_\alpha} = \mathrm{LM}(G_i)$. If there is no such $\alpha$, then $H_i = G_i$ and the algorithm
terminates.

2. Let candidate $= X^{B_\alpha} + \beta_{\alpha-1} X^{B_{\alpha-1}} + \cdots + \beta_1 X^{B_1}$. Let $E_j$ be the remainder
on division of $H_j \times$ candidate by $I + \langle G_0, \ldots, G_{a-1} \rangle$. We view $E_j$ as a
polynomial in the variables $X_1, \ldots, X_t$ over the coefficient field $K(\beta_1, \ldots, \beta_{\alpha-1})$.
Since the Gröbner basis of $I + \langle G_0, \ldots, G_{a-1} \rangle$ is contained in $K[X_1, \ldots, X_t]$,
each coefficient of $E_j$ is a $K$-linear combination of $1, \beta_1, \ldots, \beta_{\alpha-1}$.

Let $(\delta_1, \ldots, \delta_{\alpha-1}) \in K^{\alpha-1}$. The element in $\mathcal{L}(\infty Q)$ represented by
candidate with $(\beta_1, \ldots, \beta_{\alpha-1})$ replaced by $(\delta_1, \ldots, \delta_{\alpha-1})$ belongs to $J_1 : J_2$
if and only if $E_j$ with $(\beta_1, \ldots, \beta_{\alpha-1})$ replaced by $(\delta_1, \ldots, \delta_{\alpha-1})$ is zero for
$j = 1, \ldots, b$. Thus we consider the linear system of equations in the variables
$\beta_1, \ldots, \beta_{\alpha-1}$ such that all coefficients of $E_j$ are zero for $j = 1, \ldots, b$. If the
linear system of equations has no solution, then element-in-quotient has
minimum pole order $\ell$ at $Q$ among elements in $J_1 : J_2$ such that the
remainder of $\ell$ by $a$ is $i$. Thus $H_i =$ element-in-quotient, and the algorithm
terminates.

Else update element-in-quotient by candidate with $\beta_1, \ldots, \beta_{\alpha-1}$
substituted by a solution of the linear system. Find the integer $\gamma$ such that

$$B_\gamma = B_\alpha - (1, 0, \ldots, 0),$$

update $\alpha = \gamma$ and repeat this process. If there is no such $\gamma$, then $H_i =$
element-in-quotient and the algorithm terminates.

The number of iterations in the algorithm above to compute each $H_i$ is at
most

$$a + \#\,\mathrm{LM}(\langle H_0, \ldots, H_{a-1} \rangle + I) \setminus \mathrm{LM}(\langle G_0, \ldots, G_{a-1} \rangle + I)$$

$$= a + \#\,\Delta(\langle G_0, \ldots, G_{a-1} \rangle + I) \setminus \Delta(\langle H_0, \ldots, H_{a-1} \rangle + I)$$

$$= a + \dim (J_1 : J_2)/J_1,$$

where $(J_1 : J_2)/J_1$ is the factor space of $J_1 : J_2$ modulo $J_1$. If $J_1 = \mathcal{L}(-A + \infty Q)$
and $J_2 = \mathcal{L}(-B + \infty Q)$ with divisors $A \ge B \ge 0$, then

$$\dim (J_1 : J_2)/J_1 = \dim \mathcal{L}(B - A + \infty Q)/\mathcal{L}(-A + \infty Q) \quad \text{(by Corollary 1)}$$

$$= \dim \mathcal{L}(\infty Q)/\mathcal{L}(-A + \infty Q) - \dim \mathcal{L}(\infty Q)/\mathcal{L}(B - A + \infty Q)$$

$$= \deg A - (\deg A - \deg B) \quad \text{(by [7, Exercise 11.13])}$$

$$= \deg B.$$



Remark 1. When one does not need efficiency, an ideal quotient can be computed
in the standard way described in [5].
Abstract. In this paper, the critical noise beyond which no convergence
can be expected is determined for iterative decoding with belief propagation
of binary linear block codes over the binary symmetric channel.
This value is derived by developing the self composition channel model first
introduced for iterative a-posteriori-probability decoding. These results
are then applied to the cryptanalysis of a keystream generator based on
linear feedback shift registers.



1 Introduction 

Iterative decoding is a powerful method for efficient decoding of certain block 
codes. A number of algorithms for iterative decoding of certain binary block
codes have been developed and analyzed, some of them presented
in crypto-oriented forms.

Iterative decoding techniques originated from [6] where Gallager proposed 
two algorithms for decoding of his low density parity check (LDPC) codes. The 
first one is a simple flipping decoding approach based on the following: flip a bit 
if the majority of its checks indicates so. In this decoding scheme, the decoder 
computes all the parity-checks, and then changes any bit that is contained in 
more than some fixed number of unsatisfied parity-check equations. Using these 
new values, the parity checks are recomputed, and the process is repeated until 
either all parity checks are satisfied, or a predetermined number of iterations 
is reached. The simple flipping iterative decoding approach has also been employed
and analyzed in [1,19], and in crypto-oriented forms in [21].

The second algorithm of [6] is based on the approach now known as Belief 
Propagation (BP) [16]. BP based iterative decoding of LDPC codes has been 
recently considered in a number of papers [5,9]. In [9], the iterative decoding is
based on the BP algorithm. For each bit, this algorithm iteratively updates the 
a-posteriori probability of error based on the results of the check sums intersect- 
ing on that bit and the a-priori probabilities of error associated with the bits 
contributing to these check sums. At the next iteration, these a-posteriori prob- 
abilities are used to re-evaluate all parity checks and become the new a-priori 
probabilities of error. The BP algorithm takes into account and cancels the cor- 
relations between probability values introduced by the iterative process. In fact, 
it would produce the exact posterior probabilities of all the bits if the bipartite 
graph defined by the parity check matrix of the code contained no cycles [16]. In 
[5] a reduced complexity BP based iterative decoding algorithm is proposed and 
applied for decoding LDPC codes. It achieves a good performance-complexity
trade-off.

Finally, several results related to the decoding procedures based on 
a-posteriori probability (APP) threshold decoding [11] and the iterative principle 
presented in [3, pp. 152-153] have been reported. Essentially using the underly- 
ing ideas from [3,6], a number of iterative error-correction decoding algorithms 
have been developed and analyzed in [4,8,12,14,20], for example, noting that the 
algorithms in [4,12,14] are presented in crypto-oriented forms. These APP-based 
algorithms are simpler than, but not as efficient as, the BP-based algorithms
since they neglect the correlations between the values updated by the iterative 
method. Also an approach applicable for the previous decoding problems has 
been reported in [10]. This approach is based on approximating the posterior 
probabilities using a continuous optimization algorithm. As pointed out in [9], 
the employed iterative procedure is not optimal, but it is practical. 

Accordingly, reported iterative algorithms for decoding certain binary block 
codes could be classified into the following three classes: (i) simple flipping based 
iterative decoding, (ii) APP based iterative decoding, and (iii) BP based iterative 
decoding. One of the main issues regarding the iterative decoding is convergence 
through the iterations. The reported experimental results show that the itera- 
tive processing converges with high probability to the true solution, assuming 
sufficiently low noise in all three classes of algorithms (i) - (iii). For algorithms 
from the simple flipping class (i) some analytical results related to the conver- 
gence conditions for a successful iterative error-correction decoding have been 
reported in [6,19,21]. Some convergence analysis results related to the class of APP
algorithms (ii) are reported in [2,12,15]. Analytical convergence consideration of
algorithms from the BP class (iii) has been mainly based on the result from [16],
or on consideration of some simple algorithms as reported in [6,9]. Recently,
the convergence of BP based algorithms for LDPC codes has been analyzed in
detail for various channel models in [7,17].
In this paper we consider the convergence of BP based iterative decoding
algorithms in conjunction with a cryptographic application. To this end, the 
critical noise beyond which no convergence can be expected is determined for it- 
erative decoding with BP of binary linear block codes over the binary symmetric 
channel (BSC). This critical noise value is derived based on the self composi- 
tion channel model introduced in [15] for APP decoding. These results are then 
applied to the analysis of certain cryptographic pseudorandom bit generators 
(for stream ciphers) based on linear feedback shift registers (LFSR’s). The main 
crypto-goal of this paper is to point out the gain due to the replacement of 
an APP based decoding procedure by a BP based one assuming that all other 
parameters of the algorithm for the cryptanalysis are the same. 

2 Preliminaries 

We consider a binary linear $(n, k)$ block code $C$ with parity-check matrix $H$
to be used on a BSC with crossover probability $p$. The effect of a BSC with
error probability $p$ is modeled by an $n$-dimensional binary random variable $E$
defined over $\{0, 1\}^n$ with independent coordinates $E_i$ such that $\Pr(E_i = 1) = p$,
$i = 1, 2, \ldots, n$. Applying a codeword $x = [x_i]_{i=1}^{n} \in C$ to the input of the BSC,
we get the random variable $Y = E \oplus x$ as a received word at its output. Let
$y = [y_i]_{i=1}^{n}$ and $e = [e_i]_{i=1}^{n}$ denote particular values of the random variables
$Y$ and $E$, respectively. Let $p^{(j)} = [p_i^{(j)}]_{i=1}^{n}$ denote an error probability vector,
with coordinates in $[0, 1]$, after the $j$-th iteration step. More precisely, $p_i^{(j)}$ stands
for the posterior probability of error for the $i$-th received bit after the $j$-th
iteration step. Also, let $y^{(j)}$ denote the modified word after the $j$-th iteration
step. In general, an iterative probabilistic decoding algorithm (IPDA) performs
the following steps:

1. Input: received codeword $y$.

2. Initialization: set $p_i^{(0)} = p$, $i = 1, 2, \ldots, n$, and $y^{(0)} = y$.

3. Iterative probabilistic error-correction: for $j = 1, 2, \ldots$

- compute $p^{(j)}$ as the vector of posterior error probabilities using $p^{(j-1)}$
as the vector of prior error probabilities (see equation (2) below),

- if $p_i^{(j)} > 0.5$, then set $y_i^{(j)} = y_i^{(j-1)} \oplus 1$ and $p_i^{(j)} = 1 - p_i^{(j)}$, $i = 1, 2, \ldots, n$.

4. Output: estimated codeword $\hat{x} = y^{(j_{\max})}$.

The posterior error probabilities are computed by using the appropriate
parity-checks which correspond to the codewords from the dual code. We assume
that for each bit, the parity-checks used are orthogonal on that bit, meaning that
except for that bit, every other involved bit appears in exactly one of the
parity-checks. For any $i$, $i = 1, 2, \ldots, n$, let $H^{(i)}$ be the matrix whose rows are the dual
code codewords corresponding to the chosen $i$-th set of parity-checks. Let $J_i(w)$
denote the number of rows in $H^{(i)}$ containing exactly $w + 1$ ones. Given a received
codeword $y$, let $s_i(w)$ denote the number of satisfied (zero-valued) parity-checks
among the corresponding $J_i(w)$ parity-checks each involving $w + 1$ bits and let
$s_i = [s_i(w)]_{w=1}^{n-1}$. Let $S_i = [S_i(w)]_{w=1}^{n-1}$ denote the corresponding random variable
depending on the random variable $E$.

For each $i$, $i = 1, 2, \ldots, n$, let $q_i(y)$ or, simply, $q_i$ denote the ratio of posterior
error probabilities defined by

$$q_i = \frac{\Pr(E_i = 1 \mid H^{(i)}E = H^{(i)}y)}{1 - \Pr(E_i = 1 \mid H^{(i)}E = H^{(i)}y)}.$$

Let $p = [p_i]_{i=1}^{n}$ be the vector of the prior error probabilities. Then for orthogonal
parity-checks:

$$q_i = \frac{p_i}{1 - p_i} \prod_{x' \in H^{(i)}} \left( \frac{1 + (1 - 2p(x'))}{1 - (1 - 2p(x'))} \right)^{2\sigma(y, x') - 1}, \qquad (2)$$

where for every codeword $x'$ from $H^{(i)}$, $1 - 2p(x') = \prod_{l} (1 - 2p_l)$ and the product
is over all $l \neq i$ such that $x'_l = 1$, and $\sigma(y, x')$ is the value of the parity-check
determined by $x'$.
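The IPDA steps together with the posterior ratio for orthogonal parity-checks can be sketched as below. This is a minimal illustration under assumptions: `checks[i]` is a hypothetical data layout listing, for each parity-check on bit $i$, the indices of the other bits involved; an unsatisfied check multiplies $q_i$ by $(1+u)/(1-u)$ with $u = \prod_l (1 - 2p_l)$ and a satisfied one divides by it; the computation is done in the log domain with clamping, which is a numerical convenience and not part of the algorithm as stated.

```python
import math

EPS = 1e-12  # keeps probabilities away from 0 so that logarithms stay finite


def app_iteration(y, p, checks):
    """One synchronous pass of step 3: returns the updated word and error probabilities."""
    new_y, new_p = list(y), list(p)
    for i, bit_checks in enumerate(checks):
        llr = math.log(p[i] / (1.0 - p[i]))  # log of the prior ratio p_i / (1 - p_i)
        for others in bit_checks:
            u, sigma = 1.0, y[i]
            for l in others:
                u *= 1.0 - 2.0 * p[l]
                sigma ^= y[l]  # parity-check value: 0 means satisfied
            term = math.log((1.0 + u) / (1.0 - u))
            llr += term if sigma == 1 else -term
        llr = max(-50.0, min(50.0, llr))
        post = 1.0 / (1.0 + math.exp(-llr))  # q_i / (1 + q_i)
        if post > 0.5:
            new_y[i] ^= 1  # complement the bit and its error probability
            post = 1.0 - post
        new_p[i] = min(max(post, EPS), 0.5)
    return new_y, new_p


def ipda(y, p, checks, max_iter=10):
    """Run the iterative probabilistic decoding loop for a fixed number of iterations."""
    word = list(y)
    probs = [min(max(p, EPS), 0.5)] * len(word)
    for _ in range(max_iter):
        word, probs = app_iteration(word, probs, checks)
    return word
```

As a toy usage example, the length-3 repetition code has, for each bit, two checks that are orthogonal on that bit. With `checks = [[[1], [2]], [[0], [2]], [[0], [1]]]`, the received word `[1, 0, 0]` and `p = 0.1`, the first iteration flips the first bit and later iterations leave the word unchanged.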



3 Convergence Analysis for Iterative APP Decoding 



The iterative APP decoding updates iteratively the values $q_i$ for each bit-$i$, $i =
1, 2, \ldots, n$. As a result, if $q_1, q_2, \ldots, q_n$ are the values $q_i$ computed at iteration-$j$,
we simply substitute $p^{(j)} = (q_1/(1 + q_1), q_2/(1 + q_2), \ldots, q_n/(1 + q_n))$ in
the general algorithm presented in Section 2. For simplicity, suppose that the
parity-check numbers $J_i(w)$ are, for each $w$, equal for all $i$, that
is, $J_i(w) = J(w)$, $i = 1, 2, \ldots, n$, $w = 1, 2, \ldots, n - 1$. This can be obtained
by reducing the original numbers to their minimum value, which then leads to
a conservative estimate of the critical noise rate. In cryptographic applications
where we deal with low-rate truncated cyclic codes this can be a very good
approximation. Furthermore, assume that all parity checks considered have the
same weight, so that $J(w) = J$. For the convergence analysis, the particular
case when $p_i = p$, $i = 1, 2, \ldots, n$, turns out to be of special importance [15]. It
corresponds to the first iteration step of the algorithm. In this case, for $S_i = s$,

$$q_i = \frac{p}{1 - p} \left( \frac{1 + (1 - 2p)^w}{1 - (1 - 2p)^w} \right)^{J - 2s}. \qquad (3)$$

For a given value $s$, define the Bayes probability of error for each bit-$i$ after the
first iteration step of the decoding algorithm as

$$P_B(s) = \min\{\Pr(E_i = 1 \mid H^{(i)}E = H^{(i)}y),\ 1 - \Pr(E_i = 1 \mid H^{(i)}E = H^{(i)}y)\}$$

$$= \min\{\Pr(E_i = 1 \mid S_i = s),\ 1 - \Pr(E_i = 1 \mid S_i = s)\}$$

$$= \begin{cases} \dfrac{q_i}{1 + q_i} & \text{if } q_i \le 1, \\[2mm] \dfrac{1}{1 + q_i} & \text{if } q_i > 1. \end{cases} \qquad (4)$$
The average Bayes probability of error for each bit-$i$ after the first iteration step
of the decoding algorithm is given by

$$P_{B,\mathrm{APP}}(J, w) = \sum_{s=0}^{J} P_B(s)\, P(S_i = s) = p - \sum_{s:\, q_i > 1} P(S_i = s)\, \frac{q_i - 1}{q_i + 1}, \qquad (5)$$

where $q_i$ is given by (3), and

$$P(S_i = s) = p \binom{J}{s} p_w^{s} (1 - p_w)^{J - s} + (1 - p) \binom{J}{s} (1 - p_w)^{s} p_w^{J - s} \qquad (6)$$

with $p_w = (1 - (1 - 2p)^w)/2$. Note that if $P_B(s) = \Pr(E_i = 1 \mid S_i = s)$ for all
$i = 1, \ldots, n$, then no a priori decision is modified and $P_{B,\mathrm{APP}}(J, w) = p$. As a
result, no convergence is possible. It follows from (5) that a necessary condition
for convergence is that at least one $q_i > 1$. As a result, (5) suggests an equivalent
average BSC with crossover probability $P_{B,\mathrm{APP}}(J, w) < p$ obtained from
self-composition of the initial BSC.

We define the critical noise value as the noise level associated with the 
largest crossover probability p so that there exists at least one qi > 1. This 
crossover probability is defined as the critical probability for iterative APP 
decoding, and represented by $p_{\mathrm{crit,APP}}$. It is important to notice that this definition
simply implies that after iteration-1, an average probability $P_{B,\mathrm{APP}}(J, w)$
smaller than $p$ is achieved, but without guarantee that the iterative algorithm
will converge to a zero error probability with a sufficient number of iterations. 
As a result, this definition differs from that of [17] where convergence to an error 
free channel is considered. In other words, in this paper, the critical probability 
is defined as the largest p for which a better channel than the original one can 
be obtained, while in [17], the critical probability is defined as the largest p for 
which an error free channel can be obtained. Our definition can be justified by a 
crypto oriented motivation as even with residual errors, an information set de- 
coding approach can complete the iterative decoding, resulting in an unsecured 
crypto system. 
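The critical probability defined above can be estimated numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: it uses the first-iteration ratio $q_i = \frac{p}{1-p}\left(\frac{1+(1-2p)^w}{1-(1-2p)^w}\right)^{J-2s}$, observes that $q_i$ is largest for $s = 0$, and searches for the largest $p < 0.5$ with $q_i > 1$ on a grid refined by bisection. The function names and the grid size are hypothetical choices.

```python
import math


def log_q0(p, J, w):
    """Logarithm of the posterior ratio q_i for s = 0, where q_i is largest."""
    u = (1.0 - 2.0 * p) ** w
    return math.log(p / (1.0 - p)) + J * math.log((1.0 + u) / (1.0 - u))


def p_crit_app(J, w, grid=10000):
    """Largest p in (0, 0.5) for which some q_i exceeds 1 (grid search plus bisection)."""
    step = 0.5 / grid
    for k in range(grid - 1, 0, -1):
        lo = k * step
        if log_q0(lo, J, w) > 0.0:
            hi = lo + step
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if mid < 0.5 and log_q0(mid, J, w) > 0.0:
                    lo = mid
                else:
                    hi = mid
            return lo
    return 0.0
```

With parity checks of weight $w + 1 = 3$, this gives values close to 0.458 for $J = 12$ and 0.399 for $J = 5$, and for $J = 3$, $w + 1 = 6$ a value close to 0.091.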

4 Convergence Analysis for Iterative BP Decoding 

The iterative decoding based on BP updates $J + 1$ values for each bit-$i$, $i =
1, 2, \ldots, n$ [9,16]. As for the iterative APP decoding, for each bit-$i$, the a-posteriori
values $q_i$ evaluated from the $J$ check sums intersecting on bit-$i$ are updated
iteratively. However, for each check sum-$l$, $l = 1, 2, \ldots, J$, intersecting on bit-$i$, $J$
values $q_{i,l}$ are also updated iteratively. The value $q_{i,l}$ is evaluated via the $J - 1$
check sums other than check sum-$l$ intersecting on bit-$i$. Therefore, the value
$q_{i,l}$ corresponds to the probability that bit-$i$ is in error, given the information
obtained via checks other than check-$l$. At iteration-$j$, the decision about bit-$i$
is made based on $q_i$, which has been computed from the values $q_{i,l}$ evaluated
at iteration-$(j - 1)$. As a result, the BSC obtained by self-composition of the
original BSC and corresponding to the values $q_{i,l}$ has to be considered after
iteration-1.

To evaluate the value $q_{i,l}$, the result of check sum-$l$ is discarded. For $s < J$,
define $c(s)$ as the event that an unsatisfied check sum-$l$ is discarded and for
$s > 0$, define $c(s - 1)$ as the event that a satisfied check sum-$l$ is discarded. Let
$C$ represent the corresponding random variable. It follows that for each bit-$i$ and
each value of $s$, $0 \le s \le J$, the ratio of posterior error probabilities

$$\frac{\Pr(E_i = 1 \mid S_i, C)}{1 - \Pr(E_i = 1 \mid S_i, C)} \qquad (7)$$

can take one of the two possible following values

$$q_{i,1} = \frac{\Pr(E_i = 1 \mid S_i = s, C = c(s))}{1 - \Pr(E_i = 1 \mid S_i = s, C = c(s))} = \frac{p}{1 - p} \left( \frac{1 + (1 - 2p)^w}{1 - (1 - 2p)^w} \right)^{J - 2s - 1}, \qquad (8)$$

$$q_{i,2} = \frac{\Pr(E_i = 1 \mid S_i = s, C = c(s - 1))}{1 - \Pr(E_i = 1 \mid S_i = s, C = c(s - 1))} = \frac{p}{1 - p} \left( \frac{1 + (1 - 2p)^w}{1 - (1 - 2p)^w} \right)^{J - 2s + 1}. \qquad (9)$$

Note that (8) is defined for $s < J$, while (9) is defined for $s > 0$. For a given
value $S_i = s$ and a given value $C = c$, define the Bayes probability of error for
each bit-$i$ after the first iteration step of the decoding algorithm as

$$P_B(s, c) = \min\{\Pr(E_i = 1 \mid S_i = s, C = c),\ 1 - \Pr(E_i = 1 \mid S_i = s, C = c)\}
= \begin{cases} \dfrac{q_i}{1 + q_i} & \text{if } q_i \le 1, \\[2mm] \dfrac{1}{1 + q_i} & \text{if } q_i > 1, \end{cases} \qquad (10)$$

where $q_i$ is defined in (7).

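The average Bayes probability of error for each bit-$i$ after the first iteration
step of the decoding algorithm is given by

$$P_{B,\mathrm{BP}}(J, w) = \sum_{s} \sum_{c} P_B(s, c)\, P(S_i = s, C = c)$$

$$= \sum_{s} P(S_i = s) \left( \frac{J - s}{J}\, P_B(s, c(s)) + \frac{s}{J}\, P_B(s, c(s - 1)) \right)$$

$$= p - \left( \sum_{s:\, q_{i,1} > 1} \frac{J - s}{J}\, P(S_i = s)\, \frac{q_{i,1} - 1}{q_{i,1} + 1} + \sum_{s:\, q_{i,2} > 1} \frac{s}{J}\, P(S_i = s)\, \frac{q_{i,2} - 1}{q_{i,2} + 1} \right). \qquad (11)$$

As for (5), a necessary condition for convergence is that for $j = 1, 2$, there exists
at least one $q_{i,j} > 1$, so that $P_{B,\mathrm{BP}}(J, w) < p$ in (11).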

In [15], it is shown that the necessary condition derived in Section 3 is also
sufficient for the self composition model

$$P^{(j)} = P^{(j-1)} - f(P^{(j-1)}). \qquad (12)$$

Unfortunately, this result no longer applies for the self composition model

$$P^{(j)} = P^{(0)} - f(P^{(j-1)}), \qquad (13)$$

which has to be considered for iterative BP decoding. This fact also explains the
difference between our definition of the critical noise value and that of [17]. Note
finally that (12) and (13) are equivalent for $j = 1$.

As in Section 3, (11) is associated with a critical noise value and a corresponding
critical probability $p_{\mathrm{crit,BP}}$ for iterative decoding based on BP. In
fact, the following theorem shows that (11) can be simply obtained from (5) by
considering $J - 1$ check sums of weight $w + 1$ each, as expected.

Theorem 1. Let $P_{B,\mathrm{APP}}(J, w)$ and $P_{B,\mathrm{BP}}(J, w)$ represent the average Bayes
probabilities of error for each bit-$i$ after the first iteration step of iterative APP
and BP decodings, respectively. Then

$$P_{B,\mathrm{BP}}(J, w) = P_{B,\mathrm{APP}}(J - 1, w). \qquad (14)$$

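Proof. Let $s_1 \le J - 1$ represent the largest value of $s$ such that $q_{i,1} > 1$ in (11)
and define, for $0 \le s \le s_1$, $f(J, s) = (q_{i,1} - 1)/(q_{i,1} + 1)$. It follows that for
$1 \le s \le s_1 + 1$, $(q_{i,2} - 1)/(q_{i,2} + 1) = f(J, s - 1)$ and (11) can be rewritten as

$$P_{B,\mathrm{BP}}(J, w) = p - \left( \sum_{s=0}^{s_1} \frac{J - s}{J}\, P(S_i = s)\, f(J, s) + \sum_{s=1}^{s_1 + 1} \frac{s}{J}\, P(S_i = s)\, f(J, s - 1) \right)$$

$$= p - \sum_{s=0}^{s_1} \left( \frac{J - s}{J}\, P(S_i = s) + \frac{s + 1}{J}\, P(S_i = s + 1) \right) f(J, s). \qquad (15)$$

After some algebra, we obtain

$$\frac{J - s}{J}\, P(S_i = s) + \frac{s + 1}{J}\, P(S_i = s + 1)
= p \binom{J - 1}{s} p_w^{s} (1 - p_w)^{J - 1 - s} + (1 - p) \binom{J - 1}{s} (1 - p_w)^{s} p_w^{J - 1 - s}. \qquad (16)$$

By comparing (15) and (16) with (5) and (6), we conclude that $P_{B,\mathrm{BP}}(J, w)$ is
equal to the value obtained in (5) by considering $J - 1$ check sums of weight
$w + 1$ for each bit, which completes the proof. □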
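The identity of Theorem 1 is easy to check numerically. The sketch below evaluates the average Bayes error probabilities directly from their definitions, using the binomial form of $P(S_i = s)$ with $p_w = (1 - (1 - 2p)^w)/2$ and the exponents $J - 2s$, $J - 2s - 1$ and $J - 2s + 1$ for $q_i$, $q_{i,1}$ and $q_{i,2}$. The function names are hypothetical, and the code is an illustration rather than the authors' implementation.

```python
import math


def prob_s(s, J, p, w):
    """P(S_i = s): probability that exactly s of the J checks on bit i are satisfied."""
    pw = (1.0 - (1.0 - 2.0 * p) ** w) / 2.0
    c = math.comb(J, s)
    return (p * c * pw ** s * (1.0 - pw) ** (J - s)
            + (1.0 - p) * c * (1.0 - pw) ** s * pw ** (J - s))


def gain(q):
    """Reduction (q - 1)/(q + 1) of the error probability, counted only when q > 1."""
    return (q - 1.0) / (q + 1.0) if q > 1.0 else 0.0


def pb_app(J, w, p):
    """Average Bayes error probability after one APP iteration."""
    base = p / (1.0 - p)
    u = (1.0 - 2.0 * p) ** w
    r = (1.0 + u) / (1.0 - u)
    return p - sum(prob_s(s, J, p, w) * gain(base * r ** (J - 2 * s)) for s in range(J + 1))


def pb_bp(J, w, p):
    """Average Bayes error probability after one BP iteration."""
    base = p / (1.0 - p)
    u = (1.0 - 2.0 * p) ** w
    r = (1.0 + u) / (1.0 - u)
    total = 0.0
    for s in range(J + 1):
        ps = prob_s(s, J, p, w)
        if s < J:  # an unsatisfied check is discarded
            total += (J - s) / J * ps * gain(base * r ** (J - 2 * s - 1))
        if s > 0:  # a satisfied check is discarded
            total += s / J * ps * gain(base * r ** (J - 2 * s + 1))
    return p - total
```

For instance, `pb_bp(5, 2, 0.3)` and `pb_app(4, 2, 0.3)` agree up to rounding error, and both are smaller than 0.3.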

Theorem 1 suggests that iterative APP decoding converges faster than itera- 
tive decoding based on BP. However, based on the model considered, we have 
no guarantee that convergence to the correct solution is achieved. Note finally 
that both iterative APP and iterative BP decodings yield the same decision 
at iteration- 1, but different a-priori probabilities are considered to evaluate the 
a-posteriori probabilities which determine the decisions at subsequent iterations. 
From (14), we readily conclude that

$$p_{\mathrm{crit,BP}} \le p_{\mathrm{crit,APP}}. \qquad (17)$$
Since iteration-1 is common to any class of iterative decoding algorithm for the
BSC model, $p_{\mathrm{crit,APP}}$ can be viewed as an upper bound on the critical probability
of such decoding methods. Note that this upper bound remains valid for the
stronger convergence considered in [17]. Also, since the model considered for
self concatenation in Sections 3 and 4 implicitly assumes a bit-flipping strategy,
$p_{\mathrm{crit,BP}}$ can be interpreted as a more realistic estimate for practical BP based
iterative decoding schemes. Note that $p_{\mathrm{crit,BP}}$ corresponds to an instance of the
model considered in Example 5 of [17]. The values $p_{\mathrm{crit,APP}}$ and $p_{\mathrm{crit,BP}}$ for
different values $J$ and $w$ are given in Table 1. As expected, we observe that the
values $p_{\mathrm{crit,APP}}$ slightly overestimate the corresponding values derived in Table 4
of [17], while the values $p_{\mathrm{crit,BP}}$ are close to the values derived in Table 2 of [17].

Table 1. Values $p_{\mathrm{crit,APP}}$ and $p_{\mathrm{crit,BP}}$ for different values $J$ and $w$.

  J    w+1    p_crit,APP    p_crit,BP
  3     6       0.091         0.039
  4     8       0.080         0.055
  5    10       0.071         0.057
  3     5       0.127         0.061
  4     6       0.124         0.091
  3     4       0.191         0.107

5 Cryptanalysis of a Keystream Generator with LFSR’s 

In this section, we apply the previously obtained results to improve the
reported results regarding the analysis of a class of cryptographic pseudorandom
bit generators for stream ciphering systems (see [13], pp. 191-197 and 205-207, for
example). A number of the published keystream generators are based on binary
LFSR’s, assuming that parts of the secret key are used to load the LFSR’s
initial states. The unpredictability requirement, which is one of the main cryptographic
requirements, implies that the linearity inherent in LFSR’s should not be “visible”
in the generator output. One general technique for destroying the linearity is
to use several LFSR’s which run in parallel, and to generate the keystream as
a nonlinear function of the outputs of the component LFSR’s. Such keystream
generators are called nonlinear combination generators.

A central weakness of a nonlinear combination keystream generator is demonstrated
in [18]. Assuming certain nonlinear functions, it is shown in [18] that it
is possible to reconstruct independently the initial states of the LFSR’s, i.e. parts
of the secret key (and accordingly the whole secret key as well), based on the
correlation between the keystream generator output and the output of each of the
LFSR’s. The reported approach is based on exhaustive search through all possible
nonzero initial states of each LFSR. Due to the exponential complexity of
this approach, it is not feasible when the employed LFSR’s are sufficiently
long. Substantial improvements of the previous approach which yield
complexity linear in the LFSR length are proposed in [12] and [21]. Further
extensions and refinements of this approach, called the fast correlation attack, as well
as its analysis are presented in a number of papers including [2,4,10,14].

5.1 Fast Correlation Attack and Decoding 

The problem of the LFSR initial state reconstruction based on the keystream 
generator output sequence can be considered as decoding a punctured simplex 
code after a BSC with crossover probability uniquely determined by the corre- 
lation between the generator output and the component LFSR output. 

This correlation means that the modulo-2 sum of the corresponding output
of the LFSR and the generator output can be considered as a realization of a
binary random variable which takes values 0 and 1 with the probabilities $1 - p$
and $p$, respectively, $p \neq 0.5$. Accordingly, the problem of the LFSR initial state
reconstruction given the segment of the generator output can be considered as
follows: (1) The $n$-bit segment of the output sequence from the $k$-length LFSR is
a codeword of an $(n, k)$ punctured simplex code; and (2) The corresponding $n$-bit
segment of the nonlinear combination generator output is the corresponding
noisy codeword obtained through a BSC with crossover probability $p$. Also,
following the approaches from [18] and [12], note that, if the ciphertext only attack
is considered, then the influence of the plaintext in the previous model yields
only some increase of the parameter $p$.
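The decoding model above can be made concrete with a toy example. The sketch below uses a hypothetical 4-stage LFSR with feedback polynomial $x^4 + x + 1$ (chosen only for illustration, not the generator analyzed in this paper), passes its output through a simulated BSC, and lists weight-3 parity checks obtained from the feedback polynomial and its repeated squares, which is a standard way to obtain several checks per bit. All function names are assumptions.

```python
import random


def lfsr_sequence(state, n):
    """First n output bits of the recurrence s[t+4] = s[t+1] XOR s[t]."""
    s = list(state)
    while len(s) < n:
        s.append(s[-3] ^ s[-4])
    return s[:n]


def bsc(bits, p, rng):
    """Flip each bit independently with probability p."""
    return [b ^ int(rng.random() < p) for b in bits]


def parity_checks_on(i, n, taps=(0, 1, 4), squarings=3):
    """Weight-3 checks containing position i, from the taps and their repeated doublings.

    Squaring a polynomial over GF(2) doubles its exponents, so the offsets
    (0, 2, 8), (0, 4, 16), ... also give valid parity checks.
    """
    checks = []
    for k in range(squarings):
        offsets = [t << k for t in taps]
        for o in offsets:
            start = i - o  # shift the check so that one tap lands on position i
            idx = [start + x for x in offsets]
            if idx[0] >= 0 and idx[-1] < n:
                checks.append(idx)
    return checks
```

In this toy setting, the sequence is a codeword of a punctured simplex code of dimension 4, `bsc` plays the role of the correlation noise, and the number of checks returned by `parity_checks_on` depends on the position of the bit within the observed segment.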

The main underlying ideas for the fast correlation attacks are based on the 
iterative decoding principle introduced in [6]. Accordingly, all the fast correlation
attacks mentioned in the previous section could be considered as variants of 
iterative decoding based on either simple flipping or extensions of the well known 
APP decoding [11]. Due to the established advantages of BP based iterative 
decoding over iterative APP, the main objective of this section is to report 
results of applications of BP based iterative decoding for realizations of the fast 
correlation attack. 

5.2 Belief Propagation Based Fast Correlation Attack 

In this section, the advantages of BP based correlation attacks with respect
to APP based ones are verified by simulation. We consider the decoding of the
punctured simplex code of length $n = 980$ and dimension $k = 49$ defined from its
cyclic dual code generated by the primitive polynomial $1 + X^{9} + X^{49}$. It follows
that $w + 1 = 3$, while $J$ varies between 5 and 12 depending on the position of the
bit considered. We consider the BSC associated with BPSK transmission over an
additive white Gaussian noise (AWGN) channel and hard decision decoding of
the received values, so that $p = Q(\sqrt{2E_b/N_0})$, where $E_b$ and $N_0/2$ represent the
average energy per transmitted BPSK signal and the variance of the AWGN,
respectively. Note that since the keystream sequence is “superposed” on the
plaintext sequence during encryption, no redundancy is added to the message
text and therefore, no normalization of the transmitted average energy with
respect to the channel code rate is required, as opposed to conventional channel
coding schemes.
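The relation between the SNR and the crossover probability of the equivalent BSC can be evaluated with the standard library. The sketch assumes the usual Gaussian tail function $Q(x) = \frac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$ and an SNR $E_b/N_0$ given in dB; the function names are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def crossover_from_snr_db(snr_db):
    """Crossover probability p = Q(sqrt(2 Eb/N0)) for hard-decision BPSK."""
    eb_n0 = 10.0 ** (snr_db / 10.0)
    return q_function(math.sqrt(2.0 * eb_n0))
```

For example, an SNR of -15 dB corresponds to a crossover probability of about 0.40, and -12 dB to about 0.36, so the operating region of interest for the attack lies at very high noise levels.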

Fig. 1. Iterative APP decoding of the (980,49) truncated simplex code.

Fig. 2. Iterative BP decoding of the (980,49) truncated simplex code.

Figures 1 and 2 depict the simulation results for iterative APP decoding
and BP decoding, respectively. In both cases, a maximum of 20 iterations was
performed. In these figures, the word (or key) error probability is represented as
a function of the SNR $E_b/N_0$ in dB. Each key of 49 bits is encoded in systematic
form into a codeword of length 980, and at the receiver, the key is retrieved
from the decoded codeword based on the systematic encoding. Indeed, the error
performance could be further improved by information set decoding methods,
but such improvements are beyond the scope of this paper. Based on Figures 1
and 2, we observe that the iterative BP algorithm converges slower than the
iterative APP algorithm, but achieves a much better error performance.

Figure 3 compares APP and BP iterative decodings after 20 iterations at SNR
values for which the key starts having a non-zero probability of being recovered.
We observe that this event occurs for SNR ≈ -15 dB for BP decoding and SNR ≈
-12 dB for APP decoding. The corresponding crossover probabilities are p ≈ 0.40
and p ≈ 0.36, respectively. For w + 1 = 3, we compute $P_{crit,BP} = 0.454$ from
(11) and $P_{crit,APP} = 0.458$ from (5) for J = 12, and $P_{crit,BP} = 0.372$ and
$P_{crit,APP} = 0.399$ for J = 5. Similar results were observed for n = 490, in which
case J varies between 4 and 9. For w + 1 = 3, we compute $P_{crit,BP} = 0.437$
and $P_{crit,APP} = 0.444$ for J = 9, and $P_{crit,BP} = 0.327$ and $P_{crit,APP} = 0.372$
for J = 4, while from simulations, we record p ≈ 0.36 for iterative BP decoding
and p ≈ 0.33 for iterative APP decoding. In both cases, the theoretical values
provide a relatively good estimate of the critical probabilities for the iterative
BP decoding algorithm.

Fig. 3. Iterative APP and BP decodings of the (980,49) code after 20 iterations.



6 Concluding Remarks 

In this paper, characteristics of BP based iterative decoding under very
high noise have been considered. In particular, the expected noise value beyond
which iterative BP decoding is not feasible for the BSC model considered
has been derived. The established results have been applied to improve a
cryptanalytic technique: the fast correlation attack. We have shown that under the
same conditions of realization for fast correlation attacks, BP based iterative
decoding provides significant improvements upon APP based iterative decoding.
Abstract. It is necessary to authenticate the messages over an insecure
and non-authentic public channel in information-theoretic secret-key
agreement. For a scenario where all three parties receive the output
of a binary symmetric source over independent binary symmetric channels
as their initial information, an authentication scheme is proposed
based on coding theory, which uses the correlated strings between the
two communicants to authenticate the messages over the public channel.
How to select coding parameters to meet the safety requirements of the
authentication scheme is described. This paper also illustrates with an
example that when the adversary's channel is noisier than the communicants'
channels during the initialization phase, such an authentication
scheme always exists, and the lower bound of the length of authenticators
is closely related to the safety parameters, the code rate of the authentication
scheme and the bit error probabilities of the independent noisy channels in
the initialization phase.



1 Introduction 

In the past few years, information-theoretic secret-key agreement protocols
[1] [2] [4] [5] [6] [7] that are secure against adversaries with infinite computing power have
attracted much attention, because those protocols discard the unproven assumptions
on the hardness of certain computational problems, such as the discrete logarithm
or the integer factoring problem, which are essential to public-key cryptographic
protocols. Information-theoretic secret-key agreement over an authentic
public channel takes place in a scenario where two parties, Alice and Bob, who
want to generate a secret key have access to random variables X and Y, respectively,
whereas the adversary Eve knows a random variable Z. The three random
variables X, Y, and Z are distributed according to a distribution $P_{XYZ}$, and a
key agreement protocol consists of three phases: advantage distillation, in which Alice
and Bob use the advantage over Eve offered by the authenticity of the public
channel to generate an advantage over Eve in terms of their knowledge about
each other's information; information reconciliation, in which Alice and Bob agree on a
mutual string S by using error-correction techniques; and privacy amplification, in which the
partially secret S is transformed into a shorter, highly secret string S'. Bennett
et al. [6] have shown that the length of S' can be nearly the Rényi entropy of S
when given Eve's complete knowledge Z = z and the entire communication held
over the public channel.

* This work was supported by National Natural Science Foundation of China.

Most of the protocols assume the existence of an insecure but authentic
channel [2] [4] [5] [6] [7], which means that Eve can eavesdrop on the communication
between Alice and Bob, but she cannot modify or introduce fraudulent messages
over the channel without detection.

However, the existing public channels are usually non-authentic as well as
insecure. In other words, Eve can see every message and replace it by an
arbitrary message of her choice, and she may even impersonate either party by
fraudulently initiating a protocol execution. So it is necessary to authenticate
the public discussions in a secret key agreement over a non-authentic channel.
For a scenario where all three parties receive the output of a binary symmetric
source over independent binary symmetric channels as their initial information,
an authentication scheme is proposed based on coding theory, which uses the
correlated strings between the two communicants for authentication and makes
information-theoretic secret-key agreement against active adversaries possible.

2 Secret-Key Agreement and the Scenario Employed 

Generally, a key-agreement protocol consists of three phases [1]:

- An initialization phase in which Alice, Bob and Eve receive random variables
X, Y and Z, respectively, which are jointly distributed according to some
probability distribution $P_{XYZ}$.

- During the communication phase Alice and Bob alternate sending each other
messages $M_1, M_2, \ldots, M_t$, where Alice sends $M_1, M_3, \ldots$ and Bob sends
$M_2, M_4, \ldots$. Each message depends possibly on the sender's entire view of
the protocol at the time it is sent and possibly on privately generated random
bits. Let t be the total number of messages and let $M^t = [M_1, \ldots, M_t]$ denote
the set of exchanged messages.

- Finally, Alice and Bob each either accepts or rejects the protocol execution,
depending on whether they believe they are able to generate a secret key. If
Alice accepts, she generates a key S depending on her view of the protocol.
Similarly, if Bob accepts, he generates a key S depending on his view of the
protocol (maybe with the help of privacy amplification techniques).

In this paper we consider the following scenario (see Figure 1), which is inspired by [1].

1. Initialization phase: A source (maybe a satellite) broadcasts random bits
$U^n = (U_0, U_1, \ldots, U_{n-1})$ through three independent binary symmetric channels
$C_A$, $C_B$ and $C_E$ with bit error probabilities $\epsilon_A$, $\epsilon_B$ and $\epsilon_E$ to Alice, Bob
and Eve, and they receive random variables $X^n = (X_0, X_1, \ldots, X_{n-1})$,
$Y^n = (Y_0, Y_1, \ldots, Y_{n-1})$ and $Z^n = (Z_0, Z_1, \ldots, Z_{n-1})$ respectively, which means

$$P_{U_i}(0) = P_{U_i}(1) = 0.5, \quad P_{X_i Y_i Z_i | U_i} = P_{X_i | U_i} \cdot P_{Y_i | U_i} \cdot P_{Z_i | U_i}, \quad i = 0, 1, \ldots, n-1,$$

and

$$P_{X^n Y^n Z^n}(x_0, \ldots, x_{n-1}, y_0, \ldots, y_{n-1}, z_0, \ldots, z_{n-1}) = \prod_{i=0}^{n-1} P_{XYZ}(x_i, y_i, z_i).$$

2. Communication phase: Alice and Bob interchange messages over a public
channel. We assume that the public channel is an ideal noiseless channel,
but it is insecure and non-authentic.

3. Decision phase: Alice and Bob each either accepts or rejects the protocol
execution. If both of them accept the results, they generate the final secret
key.
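The initialization phase can be imitated with a few lines of simulation. The sketch below is illustrative only: the function names, the seed and the sample size are assumptions, and the helper `pairwise_error` uses the standard expression for the disagreement probability of two independent binary symmetric channels fed by the same uniform bit, which the paper later denotes by quantities such as $\epsilon_{AB}$.

```python
import random


def bsc(bits, eps, rng):
    """Pass a bit list through a binary symmetric channel with error probability eps."""
    return [b ^ int(rng.random() < eps) for b in bits]


def pairwise_error(eps_1, eps_2):
    """Probability that the outputs of two independent BSCs with a common input differ."""
    return eps_1 + eps_2 - 2.0 * eps_1 * eps_2


def initialization_phase(n, eps_a, eps_b, eps_e, seed=0):
    """Return (X^n, Y^n, Z^n) received by Alice, Bob and Eve from a uniform source."""
    rng = random.Random(seed)
    u = [rng.getrandbits(1) for _ in range(n)]
    return bsc(u, eps_a, rng), bsc(u, eps_b, rng), bsc(u, eps_e, rng)


def disagreement_rate(a, b):
    """Fraction of positions in which two equally long bit lists differ."""
    return sum(ai != bi for ai, bi in zip(a, b)) / len(a)


if __name__ == "__main__":
    x, y, z = initialization_phase(200_000, 0.01, 0.02, 0.3)
    print("Alice/Bob:", disagreement_rate(x, y), "expected", pairwise_error(0.01, 0.02))
    print("Bob/Eve:  ", disagreement_rate(y, z), "expected", pairwise_error(0.02, 0.3))
```

With Eve's channel noisier than the other two, the empirical Alice/Bob disagreement rate stays well below the Bob/Eve rate, which is the advantage the authentication scheme exploits.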



Fig. 1. The scenario in the information-theoretic secret-key agreement.



The above scenario is a special case of the general key agreement protocol, 
which is well motivated by models such as discrete memoryless sources and 
channels previously considered in information theory. Such a scenario is relatively 
easy to analyze, and its result will be helpful to ongoing research. 

It is shown in [2] that such a secret key agreement over an authentic public
channel is possible even when Eve's channel is superior to the other two channels,
i.e. $\epsilon_A > \epsilon_E$ or $\epsilon_B > \epsilon_E$. But M. Maurer proved recently that such a secret key
agreement over a non-authentic public channel is only possible when Eve's channel
is inferior to the other two channels [1]. For this reason we assume that $\epsilon_A < \epsilon_E$
and $\epsilon_B < \epsilon_E$ in this paper.



3 The Authentication Scheme 

Some concepts and facts about coding theory are introduced first. 
Definition 1. [1] The 0-1 distance from a codeword $c_1$ to a codeword $c_2$, denoted
by $d(c_1 \to c_2)$, is defined as the number of transitions from 0 to 1 when going
from $c_1$ to $c_2$, not counting the transitions from 1 to 0.

The 0-1 distance of two codewords is different from the Hamming distance
and it is not symmetric, i.e. $d(c_1 \to c_2) \neq d(c_2 \to c_1)$ in general.

Definition 2. The minimum 0-1 distance of a code C, denoted by $d_{0 \to 1}(C)$, is
defined as the smallest value among the distances of any two different codewords
in C, i.e. $d_{0 \to 1}(C) = \min d(c_i \to c_j)$, $c_i, c_j \in C$, $c_i \neq c_j$.

The minimum 0-1 distance of any conventional linear code is 0, because such a
code contains the all-zero codeword.

Lemma 1. Every conventional linear code of length n with minimum Hamming
distance d can be converted to a code of length 2n with minimum 0-1 distance
d by replacing every bit in the original code by a pair of bits, namely by replacing
0 by 01 and 1 by 10.
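The two definitions and the conversion of Lemma 1 can be illustrated on a toy code. The following is a small sketch under the assumption that codewords are represented as tuples of 0/1 integers; the [3, 2] even-weight code used as the example is our choice and does not appear in the paper.

```python
from itertools import permutations


def zero_one_distance(c1, c2):
    """Number of positions where c1 has a 0 and c2 has a 1 (0 -> 1 transitions)."""
    return sum(1 for a, b in zip(c1, c2) if a == 0 and b == 1)


def hamming_distance(c1, c2):
    """Number of positions in which c1 and c2 differ."""
    return sum(1 for a, b in zip(c1, c2) if a != b)


def to_zero_one_codeword(c):
    """Lemma 1 conversion: replace every 0 by the pair 01 and every 1 by 10."""
    out = []
    for bit in c:
        out.extend((0, 1) if bit == 0 else (1, 0))
    return tuple(out)


def minimum_distance(code, distance):
    """Smallest distance over all ordered pairs of different codewords."""
    return min(distance(a, b) for a, b in permutations(code, 2))


# Toy example (an assumption for illustration): the [3, 2] even-weight code.
even_weight_code = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
converted_code = [to_zero_one_codeword(c) for c in even_weight_code]

if __name__ == "__main__":
    print(minimum_distance(even_weight_code, hamming_distance))   # Hamming distance of the linear code
    print(minimum_distance(even_weight_code, zero_one_distance))  # 0, due to the all-zero codeword
    print(minimum_distance(converted_code, zero_one_distance))    # equals the Hamming distance
```

Every differing bit of the original codewords turns into exactly one 0-to-1 transition after the conversion, which is why the minimum 0-1 distance of the converted code equals the minimum Hamming distance of the original one.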

The authentication scheme is described as follows: 

Prerequisite: Alice, Bob and Eve obtain initial information $(X_0, X_1, \ldots)$,
$(Y_0, Y_1, \ldots)$ and $(Z_0, Z_1, \ldots)$ over independent binary symmetric channels
with bit error probabilities $\epsilon_A$, $\epsilon_B$ and $\epsilon_E$ respectively, where $\epsilon_A < \epsilon_E$ and
$\epsilon_B < \epsilon_E$.

Safety parameters: $s, s' > 1$.

Authentication performance: The receiver accepts the sender's legitimate
messages with probability at least $1 - 1/s^2$ and rejects Eve's fraudulent
messages with probability at least $1 - 1/s'^2$.

Protocol:

1. Let $\epsilon_{AB} = \epsilon_A + \epsilon_B - 2\epsilon_A\epsilon_B$, $\epsilon_{BE} = \epsilon_B + \epsilon_E - 2\epsilon_B\epsilon_E$ and
$\epsilon_{AE} = \epsilon_A + \epsilon_E - 2\epsilon_A\epsilon_E$. Choose N and d from the set of pairs (N, d) satisfying

$$N\Delta_1 \ge [s'(1 - \epsilon_{AB} - \min(\epsilon_{BE}, \epsilon_{AE}))/2]^2 \qquad (1)$$

$$s\sqrt{N\Delta_1} + s'\sqrt{(N - d)\Delta_1 + d\Delta_2} < d(\epsilon_{BE} - \epsilon_{AB}) \qquad (2)$$

and

$$s\sqrt{N\Delta_1} + s'\sqrt{(N - d)\Delta_1 + d\Delta_3} < d(\epsilon_{AE} - \epsilon_{AB}) \qquad (3)$$

where $\Delta_1 = \epsilon_{AB}(1 - \epsilon_{AB})$, $\Delta_2 = \epsilon_{BE}(1 - \epsilon_{BE})$ and $\Delta_3 = \epsilon_{AE}(1 - \epsilon_{AE})$,
and make sure that an (N, K, d) linear code C exists for some integer K.
It should be noted that if N is large enough, such an (N, K, d) linear code
always exists, a fact that will be shown in the next section. The parameters
N, K, d and the corresponding coding rules are public.

2. Convert the (N, K, d) linear code C to the corresponding 0-1 code C' of
length 2N with minimum 0-1 distance d, using the method provided by
Lemma 1.

3. Every time the sender sends a K-bit message to the receiver, she appends
an N-bit authenticator to the message. Without loss of generality, we assume
that Alice wants to send a message $M = (M_0, M_1, \ldots, M_{K-1})$ to
Bob. The authenticator must be chosen according to the following rules:
First of all, Alice encodes the message and gets the linear codeword
$C = (C_0, C_1, \ldots, C_{N-1})$ according to the coding rules of the (N, K, d)
linear code. Then she finds the corresponding 0-1 codeword
$C' = (C'_0, C'_1, \ldots, C'_{2N-1})$ from C. If $C'_{i_1} = C'_{i_2} = \cdots = C'_{i_N} = 1$ in this codeword,
then she selects $(X_{i_1}, X_{i_2}, \ldots, X_{i_N})$ from $(X_0, X_1, \ldots, X_{2N-1})$ as
the authenticator. Finally she appends $(X_{i_1}, X_{i_2}, \ldots, X_{i_N})$ to
$(M_0, M_1, \ldots, M_{K-1})$ and sends it to Bob.

4. When Bob receives the message, he also gets the corresponding 0-1 codeword
$C' = (C'_0, C'_1, \ldots, C'_{2N-1})$ from $(M_0, M_1, \ldots, M_{K-1})$. Similarly, he
determines $(Y_{i_1}, Y_{i_2}, \ldots, Y_{i_N})$ from $(C'_0, C'_1, \ldots, C'_{2N-1})$. Then he compares
$(Y_{i_1}, Y_{i_2}, \ldots, Y_{i_N})$ with the authenticator $(X_{i_1}, X_{i_2}, \ldots, X_{i_N})$; if the
number of different bits, say x, is less than $N\epsilon_{AB} + s\sqrt{N\epsilon_{AB}(1 - \epsilon_{AB})}$,
he accepts the message; otherwise he rejects it.

5. $(X_{i_1}, X_{i_2}, \ldots, X_{i_N})$ and $(Y_{i_1}, Y_{i_2}, \ldots, Y_{i_N})$ should be discarded from the
initial correlated strings after each message transmission, and never used
again for any purpose.
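Steps 3 and 4 of the protocol can be sketched in code. This is an illustrative sketch only: it assumes the linear codeword has already been produced by some encoder for the (N, K, d) code (no encoder is included), it represents bits as 0/1 integers, and all function names are hypothetical.

```python
import math


def authenticator_positions(linear_codeword):
    """Indices i_1 < ... < i_N of the ones in the 0-1 codeword (0 -> 01, 1 -> 10)."""
    zero_one = []
    for bit in linear_codeword:
        zero_one.extend((0, 1) if bit == 0 else (1, 0))
    return [i for i, bit in enumerate(zero_one) if bit == 1]


def make_authenticator(linear_codeword, sender_bits):
    """Step 3: the sender picks her correlated bits at the positions of the ones."""
    return [sender_bits[i] for i in authenticator_positions(linear_codeword)]


def accepts(linear_codeword, authenticator, receiver_bits, eps_ab, s):
    """Step 4: accept if the number of disagreements x is below the threshold."""
    n = len(linear_codeword)
    expected = [receiver_bits[i] for i in authenticator_positions(linear_codeword)]
    x = sum(a != y for a, y in zip(authenticator, expected))
    threshold = n * eps_ab + s * math.sqrt(n * eps_ab * (1.0 - eps_ab))
    return x < threshold


if __name__ == "__main__":
    codeword = (1, 0, 1)                     # assumed output of the linear encoder
    alice_bits = [1, 1, 0, 0, 1, 0]          # first 2N bits of Alice's string
    bob_bits = list(alice_bits)              # here Bob's bits happen to agree
    tag = make_authenticator(codeword, alice_bits)
    print(tag, accepts(codeword, tag, bob_bits, eps_ab=0.03, s=2.0))
```

Each pair 01 or 10 contributes exactly one 1, so the authenticator always has N bits, and two different codewords select position sets that differ in a number of places governed by the 0-1 distance. The toy length N = 3 is far too short to meet conditions (1) to (3); it only shows the mechanics.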

Remark. The required authentication performance can be accomplished by choos- 
ing proper safety parameters. 

Theorem 2. The above authentication scheme ensures that the receiver accepts the
sender's legitimate messages with probability at least $1 - 1/s^2$ and rejects Eve's
fraudulent messages with probability at least $1 - 1/s'^2$.

Proof. Let $\epsilon_{AB} = \epsilon_A + \epsilon_B - 2\epsilon_A\epsilon_B$ be the bit error probability between
corresponding bits of Alice's and Bob's strings, and let $\epsilon_{AE} = \epsilon_A + \epsilon_E - 2\epsilon_A\epsilon_E$ and
$\epsilon_{BE} = \epsilon_B + \epsilon_E - 2\epsilon_B\epsilon_E$ be the bit error probabilities between corresponding bits
of Alice's and Eve's and between Bob's and Eve's strings, respectively. We still
assume Alice is the sender (when Bob is the sender, $\epsilon_{BE}$ should be changed into
$\epsilon_{AE}$ in the following proof).

If the message sent by Alice is not modified by Eve, the subscripts of
$(Y_{i_1}, Y_{i_2}, \ldots, Y_{i_N})$ determined by Bob will be consistent with those of
$(X_{i_1}, X_{i_2}, \ldots, X_{i_N})$ received by Bob, since the public channel is noiseless. Let x
denote the number of different bits between them; the expected value and the
standard deviation of x are

$$\mu = N\epsilon_{AB}$$

and

$$\sigma = \sqrt{N\epsilon_{AB}(1 - \epsilon_{AB})}.$$

Since Bob accepts a message only when $x < N\epsilon_{AB} + s\sqrt{N\epsilon_{AB}(1 - \epsilon_{AB})}$, we get
the following result from the Chebyshev inequality:

$$\Pr\{|x - \mu| < s\sigma\} \ge 1 - \sigma^2/(s\sigma)^2$$

$$\Rightarrow \Pr\{|x - N\epsilon_{AB}| < s\sqrt{N\epsilon_{AB}(1 - \epsilon_{AB})}\} \ge 1 - 1/s^2$$

$$\Rightarrow \Pr\{x < N\epsilon_{AB} + s\sqrt{N\epsilon_{AB}(1 - \epsilon_{AB})}\} \ge 1 - 1/s^2,$$

which means that Bob accepts legitimate messages with probability at least
$1 - 1/s^2$.

When Eve has intercepted a message together with its authenticator
$(M_0, M_1, \ldots, M_{K-1}) \,\|\, (X_{i_1}, X_{i_2}, \ldots, X_{i_N})$, her best strategy for creating a new
authenticator for a different message $M' = (M'_0, M'_1, \ldots, M'_{K-1})$ (hoping that it
will be accepted by Bob) is to copy those bits from the received authenticator
that are also contained in the new authenticator and to take as guesses for
the remaining bits her copies of the bits in $(Z_0, Z_1, \ldots, Z_{2N-1})$, introducing bit
errors in those bits with probability $\epsilon_{BE}$. The maximal probability of successful
deception is hence determined by the number l of bits that Eve must guess
and the total number N of bits in the forged authenticator. When Eve tries to
deceive Bob, the expected value and the standard deviation of the number of bits in the
forged authenticator that disagree with Bob's corresponding bits are

$$\mu' = (N - l)\epsilon_{AB} + l\epsilon_{BE}$$

and

$$\sigma' = \sqrt{(N - l)\epsilon_{AB}(1 - \epsilon_{AB}) + l\epsilon_{BE}(1 - \epsilon_{BE})} = \sqrt{(N - l)\Delta_1 + l\Delta_2},$$

where $\Delta_1 = \epsilon_{AB}(1 - \epsilon_{AB})$ and $\Delta_2 = \epsilon_{BE}(1 - \epsilon_{BE})$. In fact, the 0-1 distance from
a codeword $C_1$ to a codeword $C_2$ is the number of bits that Eve must guess when
trying to convert the authenticator corresponding to $C_1$ into the authenticator
corresponding to $C_2$. Since the minimum 0-1 distance of the code C' is d, we
obtain $l \ge d$, which means that Eve must guess at least d bits to forge the
authenticator. It is easy to prove that if

$$N\Delta_1 \ge [s'(1 - \epsilon_{AB} - \epsilon_{BE})/2]^2,$$

the derivative of the function

$$f(x) = (\epsilon_{BE} - \epsilon_{AB})x - s\sqrt{N\Delta_1} - s'\sqrt{(N - x)\Delta_1 + x\Delta_2}$$

is non-negative, i.e. $f'(x) \ge 0$, when $x \ge 0$. So we get $f(l) \ge f(d)$. The scheme
assumes that

$$d(\epsilon_{BE} - \epsilon_{AB}) > s\sqrt{N\Delta_1} + s'\sqrt{(N - d)\Delta_1 + d\Delta_2}$$

holds, so

$$l(\epsilon_{BE} - \epsilon_{AB}) > s\sqrt{N\Delta_1} + s'\sqrt{(N - l)\Delta_1 + l\Delta_2}$$

also holds, which means $\mu' - \mu > s\sigma + s'\sigma'$. Let x now denote the number of
different bits between Bob's corresponding bits and the authenticator forged by Eve.
From the Chebyshev inequality we know

$$\Pr\{|x - \mu'| < s'\sigma'\} \ge 1 - \sigma'^2/(s'\sigma')^2 \;\Rightarrow\; \Pr\{x > \mu' - s'\sigma'\} \ge 1 - 1/s'^2.$$

Since $\mu' - s'\sigma' > \mu + s\sigma$, we get $\Pr\{x > \mu + s\sigma\} \ge 1 - 1/s'^2$, i.e.

$$\Pr\{x > N\epsilon_{AB} + s\sqrt{N\epsilon_{AB}(1 - \epsilon_{AB})}\} \ge 1 - 1/s'^2.$$

Therefore Bob rejects fraudulent messages with probability at least $1 - 1/s'^2$.
4 The Existence of the Authentication Scheme

In this section, we show with an example that as long as $\epsilon_A < \epsilon_E$ and $\epsilon_B < \epsilon_E$,
it is possible to find a proper linear code to implement the above scheme and
accomplish the required authentication performance.

We take extended Reed-Solomon codes over a finite field $GF(2^r)$ [3] as the
example. Let $N = 2^r$ be the code length, and let the number of
information digits be $K = c \cdot N$ where $0 < c < 1$; then the minimum Hamming
distance is $d = (1 - c)N + 1$. When the extended Reed-Solomon code is converted
to the 0-1 code, we know that the minimum 0-1 distance is still d. Substituting
$d = (1 - c)N + 1$ into the inequality (2), we obtain

$$((1 - c)N + 1)(\epsilon_{BE} - \epsilon_{AB}) > s\sqrt{N\Delta_1} + s'\sqrt{N[c\Delta_1 + (1 - c)\Delta_2] + \Delta_2 - \Delta_1} \qquad (4)$$

The left-hand side of (4) grows linearly in N while the right-hand side grows only
like $\sqrt{N}$, so there exists an $N_0$ such that both (1) and (4) hold for all integers
$N > N_0$. Therefore, we can always find an extended RS code $(N, cN, (1 - c)N + 1)$
to implement the authentication scheme, and the code rate is
$K/(K + N) = c/(c + 1)$ (see Figure 2).
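The existence argument can be explored numerically. The sketch below is a hedged illustration rather than the computation behind Figure 2: it treats N as an arbitrary integer (ignoring the constraint that N is a power of two), treats d = (1 - c)N + 1 as a real number, checks condition (1) together with (2) and (3) after this substitution, and scans only up to an assumed limit `n_max`. All names are hypothetical.

```python
import math


def pairwise(e1, e2):
    """Disagreement probability of two independent BSC outputs of the same bit."""
    return e1 + e2 - 2.0 * e1 * e2


def conditions_hold(n, c, s, s_prime, eps_a, eps_b, eps_e):
    """Check (1), (2) and (3) for length n with d = (1 - c) * n + 1."""
    e_ab, e_be, e_ae = pairwise(eps_a, eps_b), pairwise(eps_b, eps_e), pairwise(eps_a, eps_e)
    delta_1 = e_ab * (1.0 - e_ab)
    d = (1.0 - c) * n + 1.0
    if n * delta_1 < (s_prime * (1.0 - e_ab - min(e_be, e_ae)) / 2.0) ** 2:
        return False
    for e_xe in (e_be, e_ae):                    # Alice as sender, then Bob as sender
        radicand = (n - d) * delta_1 + d * e_xe * (1.0 - e_xe)
        if radicand < 0.0:
            return False
        lhs = s * math.sqrt(n * delta_1) + s_prime * math.sqrt(radicand)
        if not lhs < d * (e_xe - e_ab):
            return False
    return True


def minimum_length(c, s, s_prime, eps_a, eps_b, eps_e, n_max=100_000):
    """Smallest n such that the conditions hold for every length from n to n_max."""
    args = (c, s, s_prime, eps_a, eps_b, eps_e)
    if not conditions_hold(n_max, *args):
        return None
    n = n_max
    while n > 1 and conditions_hold(n - 1, *args):
        n -= 1
    return n


if __name__ == "__main__":
    # c = 1/2 corresponds to the code rate K / (K + N) = 1/3.
    for s in (2.0, 3.0, 4.0):
        print(s, minimum_length(0.5, s, s, 0.01, 0.02, 0.3))
```

Under these assumptions the reported length increases with the safety parameters, in line with the qualitative behaviour described for Figure 2.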




Fig. 2. The lower bound $N_0$ of the length of authenticators as a function of the safety
parameters $s, s'$ ($s = s'$) for code rates $K/(K + N) = 1/3$ and $1/9$, with bit error rates
$(\epsilon_A, \epsilon_B, \epsilon_E) = (0.01, 0.02, 0.3)$ and $(0.1, 0.02, 0.3)$.



We assume that $s = s'$ for simplicity. The dashed lines denote the case
when $\epsilon_A = 0.01$, $\epsilon_B = 0.02$ and $\epsilon_E = 0.3$, while the solid lines denote the case
when $\epsilon_A = 0.1$, $\epsilon_B = 0.02$ and $\epsilon_E = 0.3$. It is evident that the lower bound $N_0$ of the
length of authenticators grows with increasing $s, s'$, decreases with decreasing code
rate $K/(K + N)$, and the larger $\epsilon_E/\epsilon_A$ or $\epsilon_E/\epsilon_B$, the smaller $N_0$ is.

5 Conclusion 

When two communicants and an adversary obtain correlated information through 
independent binary symmetric channels from a random source, and the ad- 
versary’s channel is noisier than those of communicants, information-theoretic 
secret-key agreement secure against active adversaries is always possible since 
an authentication scheme based on coding theory can always be implemented 
at the required safety level with the help of correlated strings between the two 
communicants. The authentication scheme based on extended RS code is simu- 
lated and the result shows that the lower bound of the length of authenticator 
is closely related to safety parameters, code rate and the bit error rates of the 
channels. 

Although a linear code satisfying the requirements can always be found, the
scheme proposed in this paper may not be very practical, since the authenticator
may be too long or the code rate may be too low. So how to design a practical
authentication scheme with high code rate and moderate authenticator length
remains an open problem.
Abstract. We have previously proposed a systolic array architecture
for implementing a fast decoding algorithm for one-point AG codes. In
this paper we propose a revised architecture whose main framework is
a one-dimensional systolic array and which, in detail, is composed of a
three-dimensional arrangement of processing units called cells, and we present a
method of complete scheduling on it. Our scheme not only has
linear time complexity but also satisfies the restriction to local communication
between nearest cells, so that transmission delay is drastically
reduced.



1 Introduction 

One-point codes from algebraic curves are a class of algebraic geometry (AG)
codes which are important not only because they have potentially better performance
for longer code lengths than conventional codes such as BCH codes
and RS codes, but also because they can be decoded efficiently. We have given several
versions [1] [2] [3] [4] of fast decoding methods for them based on the
Berlekamp-Massey-Sakata (BMS) algorithm [5] [6]. As the second step toward practical use
of one-point AG codes, we must devise an efficient hardware implementation of the
decoding method. In this setting, the vector version [4] of the decoding method is
suitable for parallel processing by nature. In this paper, we propose a special
kind of systolic array architecture for its parallel implementation. This is a
revision of our papers [7] [8], partly presented at ISIT-95 and at ISIT-97, as well as
a result of these previous trials.

Kotter [9] [10] gave a parallel architecture of his fast decoding method, which 
has a form nearer to the one-dimensional Berlekamp-Massey (BM) algorithm in 
comparison with our decoding method relying deeply on the multidimensional 
BMS algorithm. His architecture consisting of a set of feedback shift registers 
is an extension of Blahut’s implementation [11] of the BM algorithm. The shift 
registers have nonlocal links among delay units, which are not desirable because 
they give rise to long delay in transmitting data necessary for computations of 
the BM algorithm. 
Instead of shift registers, we propose a systolic array architecture having only
local links between neighboring cells (processing units). The main framework 
of this systolic array is one-dimensional, and it consists of a series of pipelined 
processors along which all data are conveyed. But, since each processor is com- 
posed of a two-dimensional arrangement of cells, the detailed structure is three- 
dimensional. We assume that an array of size m is given as a set of input data, 
which is usually a syndrome array obtained from a given received word, where 
the integer m is comparable to code length. The input data are fed component 
by component into the cells constituting the first or leftmost processor of our 
systolic array. Then, all the relevant data are processed and transmitted com- 
ponent by component from the cells constituting each processor to those of its 
right-neighboring processor, and so on. As the output from the cells of the last 
or the rightmost processor, we can get the necessary data for decoding, in par- 
ticular, the coefficients of polynomials composing a basis of the module of error 
locator functions, whose common zeros coincide with the error locators. 

Our main concern is in how to give a complete schedule of all the operations 
of cells constituting our systolic array, where we assume that each cell can com- 
pute by itself a small piece of intermediate values necessary for decoding and 
communicate the relevant data with its neighboring cells. The important issues 
are synchronization of all the operations of cells and local communication of the 
data through the links existing only between nearest cells (without any nonlocal 
feedback link). 

This is an extension of our previous work [12] on a systolic array architecture 
for implementing the BM algorithm of decoding RS codes. The present situation 
is more complex because of the multidimensional character of our fast decoding 
algorithm for one-point AG codes. 

By this scheme we can get a large reduction in the time complexity of decoding
at the cost of some amount of space complexity, i.e. the number of cells, compared with the
serial scheme. Furthermore, our parallel scheme is more efficient than Kotter's,
in particular w.r.t. time complexity, where our $O(m)$ is much better than his
$O(m^2)$ for the input data size m.

2 Preliminaries 

In this paper we use the same terminologies as in the reference [4] except for 
some modifications and additions. As preliminaries, we give a brief sketch of 
concepts and symbols appearing in the subsequent sections together with short 
descriptions of correspondences between the present and the previous symbols. 
Some important symbols used in [4] and their correspondences to the present 
ones are explained in brackets [ ]. 

A one-point code from an algebraic curve X over a finite field $K = F_q$ is
defined from a prescribed point Q on the curve X, the set $\mathcal{P}$ of K-rational
points excluding Q, an affine ring $K[X] = K[\phi_1, \ldots, \phi_N]/I$ and a nonnegative
integer m, where K[X] is the set of all algebraic functions having no pole except
at the point Q, and the functions $\phi_i$, $1 \le i \le N$, constitute its basis. The
ring K[X] is a K-linear space denoted as $K[\Sigma]$ which is spanned by a set of
functions $\{x^p (:= \prod_{i=1}^{N} \phi_i^{p_i}) \mid p = (p_i)_{1 \le i \le N} \in \Sigma\}$, where $\Sigma$ is a subset of the
direct product $Z_0^N$ of the set $Z_0$ of nonnegative integers. Specifically, such a
code is defined as

$$C := \{c = (c_j)_{1 \le j \le n} \in K^n \mid \sum_{j=1}^{n} c_j f(P_j) = 0,\ f \in L(mQ)\}$$

for a subset L(mQ), which consists of algebraic functions $f \in L(\infty Q)$ ($:= K[X]$)
of pole order $o(f) \le m$, $m \in Z_0$, where $\mathcal{P} = \{P_1, \ldots, P_n\}$.

The set O of all possible pole orders $o(x^p)$ of functions $x^p$ corresponding
to $p \in \Sigma$ coincides with the set of all pole orders $o(f)$ of functions $f \in K[X]$,
which is a semigroup with identity 0 w.r.t. addition. Its elements are numbered
in increasing order, i.e. $O = \{o_l \mid l \in Z_0\}$, where $o_0 = 0 < o_1 < o_2 < \cdots$.
The first nonzero pole order $o_1$ is denoted as $\rho$, and a function $x \in K[\Sigma]$ having
$o(x) = \rho$ is fixed. Then, the set $K[\Sigma]$ becomes a $K[x]$-module over the univariate
polynomial ring K[x\ provided x is regarded as an independent variable. \x = x^ 
for a certain G S.] A basis t < p} of the semigroup O is given by 

:= min{o/ G 0\oi = i — 1 mod p}, 1 < i < p, in particular = 0. [The 
semigroup S (w.r.t. vector addition) called a cylinder is identical with the union 

+ kb^\k G Zo} for a p-tuple b*, 1 < i < p, such that o{x^ ) = 

1 < i < p.] 

In the context of decoding, an error syndrome array m = (m;), / G Zq, is 
introduced. [On the cylinder S, the syndrome array is given as m = (up), p G A, 
where up = ui for o{xP) = o;.] Furthermore, a syndrome array vector u = 

1 < t < p, accompanied with p component arrays (j,k) G 

aW, is considered, where is a subset of the direct product P x Zq for 
P := {l,---,p} C Zq. [The array vector u corresponds to a syndrome array 
u = (up), p G 2P, defined on the double cylinder 2S := {p + q\p, q G A} C Z^, 
which is represented as an array vector u = (u®), 1 < t < p, with component 
array m® = (up), p G A in [4], such that Up = p G A, In fact, = 

, hP 1 h°i= »,o)-] As K[S] is a residue class ring modulo an ideal I, the 

array components having the same value of k) := o^®) + + kp G O 

are pairwise dependent through AT-linear dependence of functions xP, p G 2S, 
having the same pole order o{xP) = oi modulo the linear subspace {x*i\o{x^) < 
oi, q G S). Thus, we introduce a pair of mappings Zq x P — > P and 

k{ 1, i): ZqxP — > Zq defined by j = r]{l, i), k = k{ 1, i) if and only if there exists a 
pair (j, k) such that o; = d(®)(j, k), i.e. (j, k) G A^®^; otherwise we denote q{l, i) = 
= 0, where = (o; — z+1 mod p) + l, and k{1,i) = (o; — o^®^ — o®'^*’®)^)/p 

if rj{l,i), n{l,i) yf 0. In fact, we know only the values from a given received 
word such that o^®^ {j, k) < m. 

Example 1 Throughout this paper we use the following code accompanied with 
a set of instance data to illustrate our method. We consider one-point codes C 
(of codelength n= 64) over K = F 2 ^ from the Hermitian curve X: -h V^Z -h 

VZ^ = 0 having genus g = 6. For the point Q = (0 : 1 : 0), the linear space 
$L(\infty Q)$ is spanned by $\{x^i y^j \mid 0 \le i,\ 0 \le j \le 3\}$, where the functions x := Z/X
and y := Y/X have pole order $o(x) = 4$ and $o(y) = 5$, respectively. Thus, the
cylinder $\Sigma = \{(i, j) \in Z_0^2 \mid j \le 3\}$ one-to-one corresponds to the semigroup of pole
orders

$$O = \{o(x^i y^j) \mid (i, j) \in \Sigma\} = \{0, 4, 5, 8, 9, 10, 12, 13, 14, 15, 16, 17, \cdots\},$$

where the first pole order is $\rho = 4$ ($= o(x)$), and $o^{(1)} = 0$ ($= o(1)$), $o^{(2)} = 5$ ($= o(y)$),
$o^{(3)} = 10$ ($= o(y^2)$), $o^{(4)} = 15$ ($= o(y^3)$). For a Hermitian code over $F_{2^4}$
defined by $L(23Q)$, the values of $o^{(i)}(j, k) := o^{(i)} + o^{(j)} + k\rho$ are shown in Table 1,
and the corresponding functions $\eta(l, i)$ and $\kappa(l, i)$ are shown in the left half of
Table 2, where the symbol - means the empty value $\emptyset$. (All the tables are in
Appendix.)
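The pole-order data of Example 1 are easy to reproduce. The following sketch assumes only what the example states, namely that the pole orders are $4i + 5j$ for $0 \le i$ and $0 \le j \le 3$; the function names and the cut-off value 23 (taken from $L(23Q)$) are illustrative choices.

```python
def pole_orders(order_x=4, order_y=5, max_y_exponent=3, limit=23):
    """Sorted pole orders o(x^i y^j) = order_x * i + order_y * j not exceeding limit."""
    orders = set()
    for j in range(max_y_exponent + 1):
        i = 0
        while order_x * i + order_y * j <= limit:
            orders.add(order_x * i + order_y * j)
            i += 1
    return sorted(orders)


def semigroup_basis(orders, rho):
    """o^(i) := min{o in O | o = i - 1 mod rho} for i = 1, ..., rho."""
    return [min(o for o in orders if o % rho == (i - 1) % rho) for i in range(1, rho + 1)]


def combined_order(basis, rho, i, j, k):
    """o^(i)(j, k) := o^(i) + o^(j) + k * rho, with 1-based indices i and j."""
    return basis[i - 1] + basis[j - 1] + k * rho


if __name__ == "__main__":
    orders = pole_orders()
    basis = semigroup_basis(orders, rho=4)
    gaps = [n for n in range(orders[-1]) if n not in orders]
    print(orders)
    print(basis)
    print(len(gaps))
```

The number of gaps of the semigroup equals the genus stated in the example, which gives a quick consistency check on the pole orders.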

The information of error locations is contained in a set of locator functions 
/ G K[S]. As any function in the module K\E] can be expressed uniquely as 

a form / = Y.'j=i with = J2k=o ^ of degree deg(fj) = Sj 

ifj,sj 0)) 1 < J < Pi it is represented by the corresponding polynomial vector 
f = (fj), 1 < j < P- The head number HN(/) (or HN(/)) and the head 
exponent HE(/) (or HE(/)) are defined from the pole order o(/), w.r.t. a p- 
tuple of prescribed weight vectors in*, 1 < f < p. In this paper, we restrict 
ourselves to the case of in* = b*, 1 < f < p. The exact error locations, which 

constitute a subset S of the set V, are obtained as the zeros of a so-called error 

locator module M{S) := {/ G K[E]\f{P) = 0, P G S}. Thus, we inquire a 
p-tuple of polynomial vectors = (/j*^)i<j<pi 1 < * < Pi which constitute 
a basis of the submodule M{E), or equivalently the corresponding polynomial 
matrix F = 1 < i,j < p, where HN(/i*^) = z, 1 < f < p, and = 

G K[x], 1 < < p. (The head exponent HE(/*'*^) is equal to 

= deg(/j*^) for a certain j, 1 < j < p, s.t. -I- sj*^p > -\- .s^^) p for 

any j', I < j' < p.) For the code defined by L(mQ), the polynomial matrix F 
can be obtained as a minimal polynomial matrix of the syndrome array vector 
up to the pole order m -I- 4p by the BMS algorithm including majority voting 
scheme. In the subsequent sections, we often dispense with the subscripts and 
superscripts, and denote /(z), /(z,j), /(z,j, fc), s(z,j), etc. instead of /j*\ 

/jfe > 

3 BMS Algorithm and Its Pipelining 

The vector version of the BMS algorithm [4] can be given in a modified form, which
finds not only a pair of minimal and auxiliary polynomial matrices $F(l) = [f(l, i)]_{1 \le i \le \rho}$,
$G(l) = [g(l, i)]_{1 \le i \le \rho}$ but also the corresponding pair of discrepancy
and auxiliary array vectors $v(l) = (v(l, i))_{1 \le i \le \rho}$, $w(l) = (w(l, i))_{1 \le i \le \rho}$
iteratively at the increasing pole orders $o_l \in O$, where each ($l$-th) iteration is
for the set of 3-tuples (i, j, k) having the same value $o^{(i)}(j, k)$ ($= o_l$), $l \in Z_0$.
The discrepancy and auxiliary array vectors v(l), w(l) are defined by a kind of
operation by the minimal and auxiliary polynomial matrices F(l), G{1), respec- 
tively, upon the syndrome array vector u as follows. Denoting u{i,j,k) := 

1 < * < P, (j,k) e p 

v{l,i,j,k) ■■=^ ^ + k), 

fl—1 I'—l 

p t{l,i,p) 

p—1 iy—1 

where the values f{l,i,j,k) (0 < fc < s{l,i,j) ■= deg{f{l,i,j))), g{l,i,j,k) 
(0 < fc < ■= deg{g{l,i,j))) are the coefficients of polynomials 

g{l, i,j)i and the values v{l, i,j, k), w{l, i,j, k) {{j, k) G are the components 
of arrays v{l,i), w{l,i), respectively. 

The value $d(f(l,i)) := v(l,i,j,k - s(l,i))$ for $(j,k) = (\eta(l,i), \kappa(l,i))$ coincides
with the discrepancy of the minimal polynomial vector $f(l,i)$ w.r.t. the syndrome
array vector $u$, provided the value $d(f(l,i))$ does not vanish, where $s(l,i) =
HE(f(l,i))$. The following is a modified BMS algorithm (without the majority logic
scheme) for a syndrome array vector given up to the pole order $m$.

Algorithm 

Step 1 (Initialization) I := 0; s(0,z) := 0, 1 < z < p; /(0,z) := (the z-th 
unit vector), 1 < z < p; c(0,z) := —1, 1 < z < p; g{0,i) := 0, 1 < z < p; 
v{0,i,j,k) := u{i,j,k), (j,k) G 1 < z, j < p; w{0,i,j,k) := 0; 

Step 2 (Discrepancy computation) $F_N := \{f(l,i) \mid d(f(l,i)) \ne 0,\ 1 \le i \le p\}$;

Step 3 (Updating) for each 1 < * < P, j := p(^,*)> k := 

^ :=d{f{l,i))/d{g{l,j)); 

Case A: if ^ F^, for {j, k) G or 

f{l + l,z,j,fc) := f{l,i,j,k), v{l + l,z,j, fc) := v{l,i,j,k), 

g{l + ^,j,j,k) := g{l,j,j,k), w{l + l,j,j,k) := w{l,j,j,k)] 

Case B: if G Fm and s{l,i) >k — c{l,j), 
for (j, k) G or SQ) 

f{l + l,i,j,k) := f{l,i,j,k) - ^g{l,j,j,k- s{l,i) + k- c{l,j)), 

d f — — — 

v{l+l,i,j,k) j, k)- ^w{lj,j,k-s{l,i) + k- c{l,j)), 

ag 

9{l + ^,j,j,k) ■■= g{l,j,j,k), w{l + 1, j, j,fc) := w{l,j,j,k); 

Case C: if f{l,i) G Fn and s{l,i) <k — c{l,j), 
for (j, k) G or jjG) 

f{l + l,i,j,k) := f{l,i,j,k-k + c{l,j) + s{l,i)) ~ ^g{l,j,j,k), 

dg 

- - df - 

v{l + l,i,j,k) ■■= v{l,i,j,k- k + c{l,j) + s(/,z)) - -fw{l,j,j,k), 

dg 

g{l + ^,j,j,k) := f{l,i,j,k), w{l + l,j,j,k) := v{l,i,j,k); 
Case D: if G Fn and g{l,j) = 0, for {j,k) G or 

f{l + Ic.J.fc) := f{l,i,j,k-k + s{l,i) - 1), 
v{l + k) := v{l,i,j,k -k + s{l,i) - 1), 

g{l + ^,j,j,k) := f{l,i,j,k), w{l + l,j,j,k) := v{l,i,j,k); 

Step 4 (Termination check) $l := l + 1$; if $o_l > m$ then stop else go to Step 2.

The computations of $v(l+1,i,j,k)$ and $w(l+1,j,j,k)$ are quite similar to
those of $f(l+1,i,j,k)$ and $g(l+1,j,j,k)$, respectively, owing to their relationships,
so that their computations involve many redundancies. However,
these duplicate computations and the redundant structure of the discrepancy and
auxiliary array vectors are indispensable for efficient parallel implementation of
the BMS algorithm, in particular, in calculation of the discrepancy $d(f(l,i))$
through the array $v(l,i,j,k)$.

By introducing a pair of modifications of the mapping we have a 

pipelined version of the above algorithm, which can be implemented easily on 
a kind of systolic array. First, putting k{i) := — i + \) / p, 1 < i < p, we 

introduce 



k) := + kp + j - I, k) := d^"\j,k) - n{i)p, 

where 6^^'>{j,k) and 6^''\j,k) are defined only for (j,k) G P x Zq s.t. 6^''\j,k), 
k) G O. We denote the sets of those points (j, k) as and respec- 
tively. Then, it is easy to see that the mappings 

i) := {oi — — p{l, i) -I- 1)/ p, k{l, i) := k{l,i) + k{i) 

satisfy the following relationships 

d^"\j,k) = 01 k{l,i) = k,r]{l,i) = j; d^"\j,k) = oi k{l,i) = k,p{l,i) = j; 
k{1,i) = k{l,i) — K{q{l,i)) = k{l,i) — K(j]{l,i)) — n{i). 

Furthermore, the values k{l,i), k{l,i) are nondecreasing w.r.t. I G Zq, and in 
particular 

k{l,i) < k{l -I- l,f) < k{l,i) + 1, k{l,i) < k{l -1-1,1) < k{l,i) + 1. 



Example 2 (Continued) In our example of code, the functions k(l,i) and k(l,i)
are shown in the right half of Table 2.

Finally, we define modified polynomial vectors /(/, i) and arrays v(l, i) having 
the coefficients and components 

v{l,i,j,k) := v{l,i,j,k- n{i) - n{j) - s{l,i)), 
f{l,i,j,k) := f{l,i,j,k- k{l,i) + s{l,i)), 







where HN{f{l,i)) = i, HE{f{l,i)) = k{l,i). Then, we can show that the up- 
datings of them and their auxiliary counterparts are given as follows: 

f{l + := f{l,i,j,k + h{l,i) - h{l + l,z)) 

—g{l,j,j, k + k{l, i) — k{l + 1, i)), 0 < k < k(l + 1, i), 

:= g(l,j, j,k) or f (I, i,j,k), depending on case, 
df - 

v{l + l,i,j,k) := v{l,i,j,k) - -j-w{l,j,j,k), (j,k) >t {j,k), 

dg 

w{l + l,j,j,k) := w{l,j,j,k)orv{l,i,j,k), depending on case. 

The components v{l,i,j,k), w{l,i,j,k) of discrepancy and auxiliary arrays 
v{l,i), w{l,i) are given for (j,k) G s.t. (j, fc) >t {j,k), where the total 
order >t is defined by (j,k) >t ,k') if and only if o^^\j,k) > o^^\j',k'). It 
is important to note that they satisfy for j = r]{l,i),k = K{l,i),k = k{l,i), k = 
k{l, i) 



f{l,i,i,k) = v{l,i,j,k) = v{l,i,j,k- s(l,i)), 

where the above right-hand sides are equal to the head coefficient of the polynomial
vector $f(l,i)$ and the discrepancy $d(f(l,i))$, respectively. Furthermore,
both polynomial coefficients $f(l,i,j,k)$ and $g(l,i,j,k)$ are synchronized in updating
$f(l,i,j,k)$ to $f(l+1,i,j,k)$, and similarly both $v(l,i,j,k)$ and $w(l,i,j,k)$
are synchronized in updating $v(l,i,j,k)$ to $v(l+1,i,j,k)$ as shown in the above
formulae. These updating formulae are a generalization of the similar ones in
parallelization of the BM algorithm [12].

4 Systolic Array and Scheduling 

To implement parallelization of our algorithm, we introduce a special kind of
systolic array architecture as follows, where instead of the subset
$\{o \in O \mid 0 \le o \le m\}$, we take $\{l \in Z_0 \mid 0 \le l \le m\}$ containing gaps for the purpose
of making the following discussions easy. (The functions $\eta$, $\kappa$, etc. are redefined
appropriately so that they are assumed to be completely defined.)

(1) It consists of a series of $m + 1$ processors: $P(0), \ldots, P(m)$, where the $l$-th
processor $P(l)$ is connected to the $(l-1)$-th (left-neighboring) and the
$(l+1)$-th (right-neighboring) processors $P(l-1)$ and $P(l+1)$, $1 \le l \le m-1$,
except for the 0-th (leftmost) and the $m$-th (rightmost) ones. There are $g$
trivial (dummy) processors working only as delayers as well as $m + 1 - g$
effective processors by which the BMS computations are executed.

(2) The leftmost processor $P(0)$ receives as input data the components of a
syndrome array vector, and the rightmost processor $P(m)$ outputs the components
of the minimal polynomial matrix, from which one can get the desired
basis of the error locator module. The $l$-th processor $P(l)$, $1 \le l \le m$,
receives as input the components of the modified arrays $v(l-1,i)$, $w(l-1,i)$
and the coefficients of the polynomial vectors $f(l-1,i)$, $g(l-1,i)$ from the
$(l-1)$-th processor $P(l-1)$ and (if it is effective) calculates the components
of the modified arrays $v(l,i)$, $w(l,i)$ and the coefficients of the polynomial
vectors $f(l,i)$, $g(l,i)$ together with the accompanying control data such as
$s(l,i)$, $d(f(l,i))$, $d(g(l,i))$, $1 \le i \le p$. Each computation of values
$v(l,i,j,k)$, etc. is a combination of one multiplication and one addition over
the symbol field $K$.

(3) Each processor $P(l)$, $0 \le l \le m$, has $p$ subprocessors $S(l,i)$, $1 \le i \le p$,
which contain $p$ cells $C(l,i,j)$, $1 \le j \le p$. In total, it consists of $p^2$ cells
$C(l,i,j)$, $1 \le i,j \le p$. For $k \in Z_0$, cell $C(l,i,j)$ manipulates the four values
$v(l,i,j,k)$, $w(l,j,j,k)$, $f(l,i,j,k)$, and $g(l,j,j,k)$ at a certain clock $n \in Z_0$,
where $j = \eta(l,i)$ and $n$ depends on $l,i,j,k$ as well as $v,w,f,g$.

(4) We assume an artificial 2-dimensional arrangement of $p^2$ cells
$C(l,i,j)$, $1 \le i,j \le p$. (We disregard realizability of such an effective structure by the
current VLSI technology.) Consequently, our systolic array architecture has
a three-dimensional structure having three perpendicular axes corresponding
to the indices $l, i, j$. The arrangement of $p^2$ cells in each processor $P(l)$ is
determined so that each cell $C(l,i,j)$ is situated at the point of coordinates
$(l, \phi_l^{-1}(i), j)$, $0 \le l \le m$, $1 \le i,j \le p$, where the mapping $\phi_l^{-1}$ is the inverse of
the following permutation $\phi_l$ of the integers $1, 2, \ldots, p$ defined by induction
w.r.t. $l$.

(Base) 

r 1, if fc = 1 

4>o{k) '■= \ p— if fc=odd & fc yf 1 
[ I -1- 1, if fc=even 

(Induction) for I < / < m 

{ (f>i-i{p), if fc = p & k + l=even 
4>i-i{k + l),\i k ^ p k, k + l=even 
</>i_i(I), iik=lkk + l=odd 
4>i-ilk — 1), if k I k k + l=odd 

(5) The main data such as components of discrepancy arrays and coefficients
of polynomials, i.e. elements of the symbol field $K$, are transmitted from the
cells in the processor $P(l)$ to the cells in the processor $P(l+1)$, $0 \le l \le m-1$.
More precisely, these communications are made through the links which
connect each cell $C(l,i,j)$, $1 \le i,j \le p$, in $P(l)$ to two cells $C(l-1,i,j)$ and
$C(l-1,i^-,j)$ in the left-neighboring processor $P(l-1)$ and also to two cells
$C(l+1,i,j)$ and $C(l+1,i^+,j)$ in the right-neighboring processor $P(l+1)$,
where the numbers $i^-$, $i^+$ are determined uniquely by the condition that
$\eta(l,i) = \eta(l-1,i^-) = \eta(l+1,i^+)$. These links are only between nearest cells
of neighboring processors (as shown in Lemma 1 below).

(6) By the processor $P(l)$, for $(j,k) = (\eta(l,i), \kappa(l,i))$, (if it is effective) the value
$v(l,i,j,k)$ is tested for 0 or not at cell $C(l,i,j)$. If $d(f(l,i)) = v(l,i,j,k)$
turns out to be not equal to zero, a flag $\mathcal{F}_1$ is set to be 1 at the cells $C(l,i,j)$,
$1 \le j \le p$, in the subprocessor $S(l,i)$, and kept onwards (otherwise $\mathcal{F}_1 := 0$).
At cell $C(l,i,j)$ with the flag $\mathcal{F}_1 = 1$ all the data $v(l,i,j,k)$, $w(l,j,j,k)$,
$f(l,i,j,k)$, $g(l,j,j,k)$ are processed for updating, so that the new values
$v(l+1,i,j,k)$ and $f(l+1,i,j,k)$ are obtained at cell $C(l+1,i,j)$ and the
new values $w(l+1,j,j,k)$ and $g(l+1,j,j,k)$ are obtained at cell $C(l+1,i^+,j)$,
respectively. Provided that $\mathcal{F}_1 = 1$ at cell $C(l,i,j)$, another flag $\mathcal{F}_2$ is set to
be, e.g. 0, 1 or $-1$ (for controlling the updating) according to cases B, C or
D.

(7) The clock for synchronization and all the control data (flags $\mathcal{F}_1$, the
integers $s(l,i)$, $c(l,j)$, etc.) are communicated between subprocessors or
maintained in subprocessors. In particular, we assume that our architecture is
equipped with a global clock to maintain synchronization of the operations
of all the cells $C(l,i,j)$, $0 \le l \le m$, $1 \le i,j \le p$. Except for a global link
for the clock information exchange among all the processors and the control
data exchange among cells of each processor, there is nothing but local
links between nearest cells of the neighboring processors as shown below.

(8) Timing of the above-mentioned data transmission and computation is adjusted
such that manipulations of the data $v(l,i,j,k)$, etc. by cells of the
processor $P(l)$ are done one or two clocks later than manipulations of the
corresponding data $v(l-1,i,j,k)$, etc. having the same index $k$ by the $p^2$ cells
of the left-neighboring processor $P(l-1)$, $1 \le l \le m$.

The links between cells $C(l-1,i,j)$, $1 \le i,j \le p$, of processor $P(l-1)$ and
cells $C(l,i,j)$, $1 \le i,j \le p$, of processor $P(l)$ are local in the sense that the
following conditions are satisfied:

(a) The differences between positions $\phi_{l-1}^{-1}(i)$ and $\phi_l^{-1}(i)$ are not greater
than 1, $1 \le i \le p$;

(b) For $\psi_l(k) := \eta(l, \phi_l(k)) = (l - \phi_l(k) + 1 \bmod p) + 1$, the difference
between values $\psi_{l-1}^{-1}(i)$ and $\psi_l^{-1}(i)$ is not greater than 1, $1 \le i \le p$,

in view of the following lemma, which can be proved by induction. (Remark: It
holds that $\phi_l(k) + \psi_l(k) = l + 2 \bmod p$, $1 \le k \le p$, $0 \le l \le m$.)

Lemma 1 In case of I + k = odd, 

4>i{k) = 4>i+i{k - 1), 2 < fc < p; 4n{l) = 4>i+i{l)-, 
ifi{k) = ifi+i{k+ 1), 1 < fc < p - 1; ipi{p) = ipi+i{p). 

In case of I + k = even, 

4>i{k) = (fi+iik -k 1), 1 < A: < p - 1; (flip) = (fi+i{p)\ 

-ifi{k) = in+i{k - 1), 2 < fc < p; ifiil) = ipi+i{l). 

Finally, we can give a complete scheduling of the BMS computations on the
above architecture, where a scheduling is a mapping from the set of all operations
(calculations and other manipulations of relevant data) of the algorithm into the
direct product $N \times C$ of the set $N$ ($:= Z_0$) of clocks and the set
$C := \{C(l,i,j) \mid 0 \le l \le m,\ 1 \le i,j \le p\}$ of all cells. A possible scheduling is given by the mappings
$n_v, n_w, n_f, n_g$ which assign the data $v(l,i,j,k)$, $w(l,i,j,k)$, $f(l,i,j,k)$, $g(l,i,j,k)$

to the clocks

$$n_v(l,i,j,k) := k(l,i) + l + k, \quad n_w(l,i,j,k) := k(l,i) + l + k,$$

$$n_f(l,i,j,k) := l + k, \quad n_g(l,i,j,k) := l + k,$$

respectively, of the cell It means that the value v{l,i,j,k) is manipu- 

lated, i.e. calculated and/or stored to be sent to the right-neighboring nearest 
cell(s) C{1 + (and C(^ -I- 1, j, j)) at clock n = riv(l,i,j,k), and so on. (Re- 

mark: For k := k{l,i), computation of the head coefficient f{l,i,i,k) of f{l,i) 
is done just at clock n = I + k.) In view of the properties k{l,i) < k{l -I- l,i) < 
k{l, i) -I- 1 and k{l, i) < k{l + l,i) < k{l, i) + 1, this scheduling ensures that the set 
of data {v{l,i,j,k), w{l,j,j,k), f{l,i,j,k), g{l,j,j,k)} available at cell 
at a certain clock n G N can be available at two cells C{l + l,i,j), C{l+l,i~^,j) 
successively to obtain {v{l + l,i,j,k), f{l + l,i,j,k)} at cell C{1 + and 

{w{l +l,j,j,k), g{l + l,j,j,k)} at cell C{l+l,i~^,j), at clock n+1 G N and/or 
at clock n + 2 under the requirement of local communication, provided that each 
cell has a buffer storage in addition to the register for the current data v{l,i,j,k), 
etc. 

The computations of $f(l,i,j,k)$ and $g(l,i,j,k)$ can start just after all the
input data have been fed into the 0-th processor and the controlling data such
as the flags $\mathcal{F}_1$, $\mathcal{F}_2$, and the integers $s(l,i)$, $c(l,i)$, etc. have been fixed at each
subprocessor $S(l,i)$ as a result of the computations of $v(l,i,j,k)$ and $w(l,i,j,k)$.
The values $\mathcal{F}_1$, $\mathcal{F}_2$, $s(l,i)$, $c(l,i)$, etc. control the finite field computations
of $f(l,i,j,k)$ and $g(l,i,j,k)$ during the whole process of the algorithm.
Consequently, the total number of clocks required is about $3m$. Thus, the time
complexity is $O(m)$ and the space complexity as the number of cells is $O(p^2 m)$.
Kötter [9] gave a parallel implementation of a fast decoding algorithm of one-point
AG codes. In his architecture, he uses a set of $p$ feedback shift registers
each with $m$ storage elements. That scheme has time complexity of $O(m^2)$ and
space complexity of $O(pm)$. Since $p$ is usually much less than $m$, our scheme is
better than his in regard to the total complexity as well as in regard to time
complexity. For instance, in case of Hermitian codes where $O(p) = O(m^{1/3})$, our
time and space complexities $O(m)$, $O(m^{5/3})$ can be compared with his $O(m^2)$,
$O(m^{4/3})$.



5 Concluding Remarks 

In this paper we have proposed a special kind of systolic array architecture
composed of a three-dimensional arrangement of processing units called cells for
implementing fast decoding of one-point AG codes and presented a method
of scheduling all the operations done by cells, where each cell executes a
combination of multiplication and addition over the finite field (symbol field)
by itself and transmits the result to its nearest-neighboring cells at each clock.
This architecture not only satisfies the restriction to local communication which is
usually required for any efficient systolic array architecture, but also has linear
time complexity.

We omit discussions about incorporating majority logic scheme and some 
other subtle aspects for decoding into our parallel architecture. 
Appendix

Table 1: Values $o^{(i)}(j,k)$ for a code.
Abstract. Weight distributions of convolutional codes are important
because they permit computation of bounds on the error performance.
In this paper, we present a novel approach to computing the complete
weight distribution function (WDF) of a convolutional code. We compute
the weight distribution series using the generalized Viterbi Algorithm
(GVA) and then find the minimum linear recursion relation in this series
using the shift register synthesis algorithm (SRSA). The WDF follows
from the minimum recursion. In order to generalize the use of the SRSA
over certain commutative rings, we prove the key result that the set of
finite recursions forms a principal ideal.



1 Introduction 

Weight distributions of convolutional codes are important because they can be 
used to compute error performance bounds for the code. Viterbi and Omura 
[2] derive upper bounds for the probability of word and bit errors for a linear 
code in terms of the Hamming weights of all error events. Traditionally, the 
weight distribution function is computed using signal flow graph analysis and 
Mason’s gain formula [3]. Other methods have been presented by Fitzpatrick and 
Norton [5], Onyszchuk [6], and McEliece [7]. In [8], we presented an algorithm 
to compute univariate weight distribution functions. This paper extends the 
technique to compute multivariate weight distribution functions. Our method is 
to use the Viterbi algorithm to generate a recursive array and use the Berlekamp- 
Massey algorithm (BMA) to find the minimum recursion in the array. Massey 
showed [1] that the BMA synthesizes the minimum length shift register capable 
of generating the recursive array and so we will refer to it as the shift register 
synthesis algorithm (SRSA). The weight distribution function follows easily from 
the minimum recursion. 

We introduce the key steps of the problem by an example. We will compute
the weight distribution function of the code $[1 + D + D^2,\ 1 + D^2]$, whose encoder
and state diagram are shown in Fig. 1.

Fig. 1. Encoder and State Diagram for code with $G(D) = [1 + D + D^2,\ 1 + D^2]$.

We are interested in enumerating the error events for the code, i.e. those
paths in the state diagram which deviate from the zero-state and remerge with
the zero-state exactly once. It can be seen by inspection that the free distance of
this code is 5 and that there is exactly 1 error event of output weight 5. There
are 2 events of weight 6, 4 of weight 7, 8 of weight 8, and so on. The number
of events forms the sequence $S = \{1, 2, 4, 8, \ldots\}$. By inspection, the sequence
satisfies the recursion: $S_0 = 1$, $S_n = 2S_{n-1}$ for $n \ge 1$. Associating the number
of events of weight $n$ with the coefficient of $X^n$, we get the weight distribution
series for the code as

$$X^5 + 2X^6 + 4X^7 + 8X^8 + \cdots = X^5(1 + 2X + 4X^2 + \cdots) = X^5 \sum_{i=0}^{\infty} S_i X^i.$$



To get the weight distribution function, consider the sum

$$\sum_{i=0}^{\infty} S_i X^i = S_0 + \sum_{i=1}^{\infty} 2S_{i-1} X^i = S_0 + 2X \sum_{i=1}^{\infty} S_{i-1} X^{i-1} = S_0 + 2X \sum_{i=0}^{\infty} S_i X^i.$$



Substituting $S_0 = 1$ and solving, we get the output weight distribution function

$$X^5 \sum_{i=0}^{\infty} S_i X^i = \frac{X^5}{1 - 2X}.$$
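As a quick check of the example above, the rational function can be expanded back into its power series. The following is a minimal sketch, not part of the paper: the helper name `series_of_rational` and the list-of-coefficients representation (lowest degree first) are assumptions made for illustration.

```python
def series_of_rational(num, den, terms):
    """Return the first `terms` power-series coefficients of num(X)/den(X).

    `num` and `den` are coefficient lists, lowest degree first, and
    den[0] must be 1 so that the division stays within the integers.
    """
    assert den[0] == 1
    out = []
    for n in range(terms):
        c = num[n] if n < len(num) else 0
        # Solve den(X) * out(X) = num(X) for the coefficient of X^n.
        for i in range(1, min(n, len(den) - 1) + 1):
            c -= den[i] * out[n - i]
        out.append(c)
    return out


# T(X) = X^5 / (1 - 2X): numerator X^5, denominator 1 - 2X.
coeffs = series_of_rational([0, 0, 0, 0, 0, 1], [1, -2], 10)
print(coeffs)
```

The coefficients of $X^5, X^6, X^7, \ldots$ come out as 1, 2, 4, ..., in agreement with the event counts listed for the example code.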



In the simple example above, we were able to easily generate the recursive 
series and identify the recursion. It is not so easy in general. The next section 
defines weight distribution functions. Section 3 explains some of the theory of 
linear recursion relations and proves the key result that the set of finite recursions 
is a principal ideal. Section 4 indicates how to generalize shift register synthesis 
to certain rings. Section 5 describes the generalized Viterbi Algorithm (GVA). 
Sections 6 and 7 describe how to generate the weight distribution series and 
compute the weight distribution function respectively. The final section makes 
some concluding remarks. 
2 Weight Distributions

We first introduce some notation. Usually, $F$ will denote a field and $R$ a commutative
ring with identity. $R[X_1, \ldots, X_n]$ and $R[[X_1, \ldots, X_n]]$ are respectively the
polynomial and power series rings in $n$ variables. We denote the set of integers
as $Z$ and the set of rational numbers as $Q$.

An error event is a nonzero finite weight codeword for which the encoder 
starts in the zero state, departs, and returns to the zero state exactly once. In 
addition, we will call two error events equal if they are shifts of each other. The 
weight distribution of a convolutional code is defined as the weight distribution 
of the set of error events. 

The output weight distribution series (OWDS) of a code is a power series in
$Z[[X]]$, where the coefficient of $X^n$, $a_n$, indicates the number of error events of
weight $n$. A recursion relation can be found in this series and it can be expressed
as a rational function, called the output weight distribution function (OWDF):

$$T(X) = a_1 X + a_2 X^2 + \cdots + a_n X^n + \cdots = \frac{\Omega(X)}{\Lambda(X)}, \tag{1}$$

where $\Lambda(X)$, $\Omega(X)$ are polynomials over $Z$ with $\deg \Omega(X) < \deg \Lambda(X)$.

The input-output weight distribution series (IOWDS) is a multivariate power
series in $Z[[X, Y]]$, where the coefficient of $X^n Y^k$, $a_{n,k}$, indicates the number of
error events of weight $n$ which correspond to input sequences of weight $k$. The
rational function representation is called the input-output weight distribution
function (IOWDF):

$$T(X,Y) = \sum_{n,k} a_{n,k} X^n Y^k = \frac{\Omega(X,Y)}{\Lambda(X,Y)}. \tag{2}$$

Although we assumed the existence of an encoder for the convolutional code, 
the OWDS and OWDF are independent of the encoder and depend only on the 
structure of the code. Since the IOWDS and IOWDF indicate the input sequence
associated with the output codeword, they do depend on the choice of encoder. 

Traditionally, the WDF is computed by signal flow graph analysis and Ma- 
son’s gain formula [3]. From [3], we see that the denominator of the WDF is a 
polynomial with constant term one. In the following section, we will see that this 
implies that the sequence of coefficients of the WDS is a shift register sequence. 

3 Recursion Relations 

Let $R$ be a ring. Let $f(X) = \sum_{i=0}^{d} f_i X^i$ be an element of $R[X]$, where $f_i$ may
be equal to zero for any $i$ (e.g. $f(X) = 1 - X - X^2 - 0X^3$). The sequence
$\{S_i : i = 0, 1, 2, \ldots\}$ with entries from $R$ is said to satisfy the linear recursion
relation specified by $f$ if

$$\sum_{i=0}^{d} f_i S_{r-i} = 0 \quad \forall\, r \ge d. \tag{3}$$
Fig. 2. Shift Register defined by $f(X) = 1 + \sum_{i=1}^{d} f_i X^i$.



We also call $f$ a recursion for the sequence $\{S_i\}$. Define a polynomial with
constant term one to be motonic. If $f \in R[X]$ is motonic and $\{S_i\}$ satisfies the
recursion specified by $f$, then (3) implies

$$S_r = \sum_{i=1}^{d} -f_i S_{r-i} \quad \forall\, r \ge d. \tag{4}$$



This means $\{S_i\}$ is a shift register sequence and it can be generated by the shift
register shown in Fig. 2.
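Recursion (4) can be run directly as a software shift register. The sketch below is illustrative only; the function name `shift_register_sequence` and the convention that `f` is a coefficient list $f_0, \ldots, f_d$ with $f_0 = 1$ are assumptions, not notation from the paper.

```python
def shift_register_sequence(f, initial, length):
    """Extend `initial` to `length` terms using S_r = -sum_{i=1..d} f[i] * S_{r-i}.

    `f` lists the coefficients f_0..f_d of a polynomial with f[0] == 1,
    and `initial` holds the d starting values S_0..S_{d-1}.
    """
    d = len(f) - 1
    assert f[0] == 1 and len(initial) == d
    seq = list(initial)
    for r in range(d, length):
        seq.append(-sum(f[i] * seq[r - i] for i in range(1, d + 1)))
    return seq[:length]


# f(X) = 1 - 2X reproduces the event counts of the introductory example.
doubling = shift_register_sequence([1, -2], [1], 6)
# f(X) = 1 - X - X^2 with S_0 = S_1 = 1 gives the Fibonacci numbers.
fibonacci = shift_register_sequence([1, -1, -1], [1, 1], 8)
print(doubling, fibonacci)
```

Only the $d$ initial values and the coefficients of $f$ are needed; every later term is determined by the recursion.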

We can also think of the sequence $\{S_i : i = 0, 1, 2, \ldots\}$ as an element of the
power series ring by associating with it the series $S(X) = \sum_{i=0}^{\infty} S_i X^i$. The
series $S(X)$ is often called the generating function associated with the sequence
$\{S_i\}$. The generating function has a nice form when $S(X)$ comes from a shift
register sequence. Suppose the sequence $\{S_i\}$ is a shift register sequence which
satisfies the recursion specified by the motonic polynomial $f(X) = 1 + \sum_{i=1}^{d} f_i X^i$.

Then by a similar derivation as in [9], we can write

$$\sum_{i=0}^{\infty} S_i X^i = \frac{\sum_{i=0}^{d-1} f_i X^i \sum_{k=0}^{d-i-1} S_k X^k}{f(X)}. \tag{5}$$

Note that the numerator depends on the $d$ initial states of the shift register. The
denominator, $f(X)$, specifies the recursion and is independent of the initial
conditions. If $f(X)$ is the minimum degree polynomial which specifies the recursion,
it is called the characteristic function of the series.
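The numerator of (5) can be computed from the recursion and the initial states alone, and the result checked by expanding the quotient. This is a small sketch under the assumption that polynomials are stored as coefficient lists, lowest degree first; `gf_numerator` and `expand` are hypothetical helper names.

```python
def gf_numerator(f, initial):
    """Numerator of equation (5), as a coefficient list (lowest degree first).

    f       : coefficients f_0..f_d of a motonic polynomial (f[0] == 1)
    initial : the d initial states S_0..S_{d-1}
    """
    d = len(f) - 1
    assert f[0] == 1 and len(initial) == d
    num = [0] * d
    for i in range(d):
        for k in range(d - i):          # k = 0 .. d-i-1
            num[i + k] += f[i] * initial[k]
    return num


def expand(num, f, terms):
    """First `terms` series coefficients of num(X)/f(X), assuming f[0] == 1."""
    out = []
    for n in range(terms):
        c = num[n] if n < len(num) else 0
        c -= sum(f[i] * out[n - i] for i in range(1, min(n, len(f) - 1) + 1))
        out.append(c)
    return out


f = [1, -1, -1]                      # f(X) = 1 - X - X^2
numerator = gf_numerator(f, [1, 1])  # initial states S_0 = S_1 = 1
series = expand(numerator, f, 8)
print(numerator, series)
```

For this choice the numerator reduces to the constant 1, so the generating function is $1/(1 - X - X^2)$, and expanding it returns the sequence the shift register would generate.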

Suppose $S \in R[[X_1, \ldots, X_n]]$ and there exist $f$ and $g$ in $R[X_1, \ldots, X_n]$ such
that $S = \frac{g}{f}$. Then $f$ is called a finite recursion. We make the distinction between
an arbitrary recursion and a finite recursion because, unlike one dimensional
recursions, not all multidimensional recursions are finite. The next theorem shows
that for certain commutative rings, the set of finite recursions forms a singly
generated ideal in $R[X_1, \ldots, X_n]$.
Theorem 1. Let $R$ be a commutative Noetherian Unique Factorization Domain
(UFD). Let $K = R[X_1, \ldots, X_n]$, and $L = R[[X_1, \ldots, X_n]]$. Let $S \in L$ and let

$$I = \{h \in K : Sh = g \text{ for some } g \in K\}.$$

Then $I$ is a principal (singly generated) ideal of $K$.



Proof. The basic idea behind the proof is as follows. It is clear that $I$ is an ideal
of $K$. Next we show that any two elements of the ideal are multiples of some
element in the ideal. By the Hilbert Basis Theorem, $I$ is finitely generated. The
result then follows by collapsing the finite basis to a single generator.

Since $S \cdot 0 = 0$, $I \ne \emptyset$.

Suppose $h_1, h_2 \in I$, $a \in K$, $Sh_1 = g_1$, and $Sh_2 = g_2$.
Then $S(h_1 + h_2) = Sh_1 + Sh_2 = g_1 + g_2 \in K$.

Also $S(h_1 a) = (Sh_1)a = g_1 a \in K$. So $I$ is an ideal of $K$.

Next, let $d_1 = \gcd(h_1, g_1)$ and $d_2 = \gcd(h_2, g_2)$.
Then $\exists\, h_1', h_2', g_1', g_2' \in K$ such that

$$Sh_1 = g_1 \;\Rightarrow\; S d_1 h_1' = d_1 g_1' \;\Rightarrow\; S h_1' = g_1'$$

$$Sh_2 = g_2 \;\Rightarrow\; S d_2 h_2' = d_2 g_2' \;\Rightarrow\; S h_2' = g_2'$$

where $\gcd(h_1', g_1') = 1$ and $\gcd(h_2', g_2') = 1$.

Note that the above equations imply that $h_1', h_2' \in I$.

Now, let $d = \gcd(h_1', h_2')$. Then $\exists\, h_1'', h_2'' \in K$ such that

$$S h_1' = g_1' \;\Rightarrow\; S d h_1'' = g_1'$$

$$S h_2' = g_2' \;\Rightarrow\; S d h_2'' = g_2'$$

Recall that since $R$ is a UFD, $K$ is a UFD. Multiplying the first equation by $h_2''$ and
the second by $h_1''$ gives $g_1' h_2'' = g_2' h_1''$, and

$$\gcd(h_1', g_1') = 1 \;\Rightarrow\; \gcd(h_1'', g_1') = 1 \;\Rightarrow\; h_1'' \mid h_2'',$$

$$\gcd(h_2', g_2') = 1 \;\Rightarrow\; \gcd(h_2'', g_2') = 1 \;\Rightarrow\; h_2'' \mid h_1''.$$

So $h_1'' = u h_2''$ where $u \in K$ is a unit. Then $h_1' = u h_2'$, so

$$h_1 = d_1 h_1' = d_1 u h_2', \qquad h_2 = d_2 h_2',$$

where $h_2' \in I$.

Finally, recall that since $R$ is Noetherian, the Hilbert Basis Theorem implies $K$
is Noetherian. So there exists a finite generating set for $I$ which can be collapsed
into a single generator by the above argument. So $I$ is a principal ideal of $K$. □



4 Generalized Shift Register Synthesis 

The Shift Register Synthesis Algorithm (SRSA), as described in [1], takes an
input sequence, whose elements come from a field $F$, and generates the shift
register of minimum length, $L$, which generates the sequence. The characteristic
function of the shift register is a polynomial with coefficients from $F$, degree
$\le L$ and constant term one. In other words, the SRSA finds the minimum
degree motonic polynomial in $F[X]$ which specifies the recursion. Recalling that
the set of recursions forms a principal ideal in $F[X]$, it is easy to see that the
SRSA solution is also the generator of the ideal.

We want to generalize the SRSA and apply it to a sequence whose elements lie
in a ring. For example, if we input an integer sequence, which has an underlying
motonic integer recursion, it turns out that the SRSA yields that motonic integer
recursion. In this section, we show why. Recall that while $Z$ is not a field, it can
be embedded in its field of fractions, $Q$. In this case, we can consider the integer
sequence as a rational sequence, and so the SRSA should yield the minimum
motonic rational recursion. So we need to show that the minimum motonic
rational recursion lies in $Z[X]$.

Let $S \in Z[[X]] \subset Q[[X]]$. Let $I = \{h \in Z[X] : hS \in Z[X]\}$ and
$J = \{h \in Q[X] : hS \in Q[X]\}$. By Theorem 1, $I$ and $J$ are singly generated ideals. Let
$I = (f)$ and $J = (g)$ for some $f \in I$ and $g \in J$. Since $Z[X] \subset Q[X]$, $I \subset J$.
Figure 3 depicts the various inclusions.




Fig. 3. $Z[X] \subset Q[X]$, $I \subset J$.



A simple argument shows that any generator of $I$ also generates $J$:

$J = (g) \Rightarrow J = (ug)$ for any unit $u \in Q[X]$. By clearing the denominators, it is
possible to choose $u$ such that $ug \in Z[X]$. So $ugS \in Z[X]$ which implies $ug \in I$.
This means $ug$ is a $Z[X]$-multiple of $f$ and so $J = (f)$.

Solution via Mason’s gain formula [3] means there exists a motonic polyno- 
mial in I. This implies that there exists a motonic generator of I, which, by 
the preceding argument, generates J. In a singly generated ideal, generators are
unique up to multiplication by a unit, and so the motonic generator, if it exists, 
is unique. Thus the motonic generator of J lies in Z[X] and this is precisely what 
the SRSA computes. 

The discussion above is easily generalized by replacing Z by a Noetherian 
UFD R and Q by the field of fractions of R. 
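The claim of this section can be observed numerically: running shift register synthesis over $Q$ on an integer sequence with a motonic integer recursion returns integer coefficients. The following is a textbook Berlekamp-Massey sketch using Python's `fractions.Fraction` for exact rational arithmetic; it is an illustration written for this purpose, not the authors' implementation, and the function name `berlekamp_massey` is an assumption.

```python
from fractions import Fraction


def berlekamp_massey(seq):
    """Return (C, L): connection polynomial C (C[0] == 1) and register length L.

    The recursion found is sum_{i=0..deg C} C[i] * s[n-i] = 0, over the rationals.
    """
    s = [Fraction(x) for x in seq]
    C = [Fraction(1)]            # current connection polynomial
    B = [Fraction(1)]            # copy of C saved at the last length change
    L, m, b = 0, 1, Fraction(1)  # length, shift since last change, old discrepancy
    for n in range(len(s)):
        # Discrepancy between s[n] and the value predicted by C.
        d = s[n] + sum(C[i] * s[n - i] for i in range(1, len(C)))
        if d == 0:
            m += 1
            continue
        T = C[:]
        coef = d / b
        # C(X) <- C(X) - (d/b) * X^m * B(X)
        C = C + [Fraction(0)] * (len(B) + m - len(C))
        for i, bi in enumerate(B):
            C[i + m] -= coef * bi
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    return C, L


recursion, length = berlekamp_massey([1, 2, 4, 8, 16, 32])
print(recursion, length)   # coefficients of 1 - 2X, register length 1
```

Although the arithmetic is carried out in $Q$, every coefficient of the returned polynomial has denominator 1 for this input, which is the behaviour the argument above predicts.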

5 The Generalized Viterbi Algorithm (GVA) 

A trellis for a convolutional encoder is an extension in time of the encoder’s
state diagram. Thus a trellis is a directed graph consisting of vertices, or states,
interconnected by edges, or branches. In a recent paper, McEliece [4] presented
a generalized version of the Viterbi Algorithm which operates on a trellis in
which the branch labels come from a semiring. Examples of semirings include
the non-positive real numbers and the familiar ring of polynomials, $Z[X]$. Figure
4 depicts a trellis section of an encoder for the convolutional code which has a
generator matrix given by $G(D) = [1 + D + D^2,\ 1 + D^2]$.

Fig. 4. Trellis for code with $G(D) = [1 + D + D^2,\ 1 + D^2]$.

As previously mentioned, the branch labels come from the semiring. A path is
a sequence of branches and the path metric of a path $P$ is the product ($\cdot$) of the
labels of the branches in the path, taken in order. The flow between two states,
$S_1$ and $S_2$, is defined as the sum of the path metrics of all paths starting
at $S_1$ and ending at $S_2$. For example, when we use the semiring of polynomials
over $Z$ described above, the flow between two states is the weight enumerator
polynomial that counts the number of paths of each possible weight between the
states. In this context, the generalized Viterbi Algorithm is an efficient algorithm
for computing the flow between two states [4].

6 Using the GVA to Compute the WDS 

We can use the GVA to enumerate the paths which diverge immediately from
the zero state and remerge with the zero state exactly once. Let the semiring
be $Z[X]$ with the usual polynomial multiplication and addition. We label each
branch in the trellis with a monomial $X^n$, where the exponent $n$ corresponds
to the Hamming weight of the output associated with the branch. Now all error
events correspond to paths which leave the zero state and return to it exactly once.
Consider a path, $p_1$, which stays at the zero state for a finite time, then diverges
and remerges with the zero state. Then $p_1$ corresponds to the same error event
as the path, $p_2$, which immediately diverges from the zero state, imitates $p_1$ and
remerges with the zero state. We can modify the original trellis in the following
manner to prevent overcounting paths corresponding to the same error event.
In the original trellis, remove all nonzero branches which diverge from the zero
state except the initial one. In addition we break the self-loop at the zero state
by removing the initial branch connecting the zero states. Note that a branch
can be effectively removed by setting the branch label to zero. Figure 5 shows
the modified trellis for the trellis of Fig. 4.

Now we apply the GVA to the modified trellis, setting the initial flow to 1 for
the zero state and 0 for all nonzero states. After each iteration, the flow at the
zero state consists of terms in the WDS. It will soon be clear that the lower order
terms will stabilize after a certain number of iterations. So after every iteration
the stabilized terms form a truncated version of the WDS.

Fig. 5. Modified Trellis for code with $G(D) = [1 + D + D^2,\ 1 + D^2]$.

Let us apply the GVA to the convolutional code of Fig. 5. Table 1 shows
the computations. After seven and eight iterations of the GVA, we see that the
partial WDS are $x^5 + 2x^6 + 4x^7 + 4x^8 + x^9$ and $x^5 + 2x^6 + 4x^7 + 7x^8 + 5x^9 + x^{10}$
respectively. Note that the first three terms remain unchanged after the seventh
iteration. From Fig. 5, we see that changes to the flow at the 00 state depend
on the flow at all other states. After the seventh iteration, the minimum degree
term at the 10 state is $3x^6$. Since any path connecting 10 to 00 has path metric
at least $x^2$, it can affect only the coefficients of the terms of degree eight and
higher. The minimum degree term at 01 is $x^5$. Since any path from 01 to 00
has path metric at least $x^3$, it also can affect only the coefficients of the terms
of degree eight and higher. A similar argument for 11 proves that the terms of
degree seven and lower in the WDS will remain unchanged after every iteration
thereafter.
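The iteration just described can be reproduced with a few lines of code. The sketch below assumes the encoder $G(D) = [1 + D + D^2,\ 1 + D^2]$ with state $(u_{t-1}, u_{t-2})$; this state ordering, the dictionary representation of polynomials as `{degree: coefficient}`, and the names `branch` and `gva_error_events` are choices made here for illustration and may not match the state labels used in the figures.

```python
from collections import defaultdict


def branch(state, u):
    """One encoder step for G(D) = [1 + D + D^2, 1 + D^2].

    `state` is (u_{t-1}, u_{t-2}); returns (next_state, output_weight).
    """
    a, b = state
    out = ((u + a + b) % 2) + ((u + b) % 2)
    return (u, a), out


def gva_error_events(iterations):
    """Flow at the zero state of the modified trellis, as {weight: count}."""
    zero = (0, 0)
    flow = {zero: {0: 1}}          # initial flow: the polynomial 1 at the zero state
    for step in range(iterations):
        new = defaultdict(lambda: defaultdict(int))
        for state, poly in flow.items():
            for u in (0, 1):
                nxt, w = branch(state, u)
                if state == zero:
                    # Modified trellis: only the very first branch may leave the
                    # zero state, and the initial zero-to-zero branch is removed.
                    if (u == 1) != (step == 0):
                        continue
                for deg, cnt in poly.items():
                    new[nxt][deg + w] += cnt   # multiply by X^w, then add
        flow = {s: dict(p) for s, p in new.items()}
    return flow.get(zero, {})


print(gva_error_events(7))
print(gva_error_events(8))
```

Under these assumptions the zero-state flow after seven and eight iterations matches the partial WDS quoted above, and the coefficients of degree seven and lower no longer change.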

The above paragraphs describe how to compute the OWDS. To compute the
IOWDS, we simply label each branch with a monomial from $Z[X, Y]$, i.e. $X^n Y^k$,
where $n$ and $k$ correspond to the Hamming weights of the output and input
respectively associated with the branch.



Table 1. GVA Computations for $G(D) = [1 + D + D^2,\ 1 + D^2]$.

State  Iteration
       0   1     2     3     4           5                  6
00     1   0     0     x^5   x^5 + x^6   x^5 + 2x^6 + x^7   x^5 + 2x^6 + 3x^7 + x^8
01     0   x^2   0     x^3   x^4         x^4 + x^5          2x^5 + x^6
10     0   0     x^3   x^4   x^4 + x^5   2x^5 + x^6         x^5 + 3x^6 + x^7
11     0   0     x^3   x^4   x^4 + x^5   2x^5 + x^6         x^5 + 3x^6 + x^7

State  Iteration
       7                                8
00     x^5 + 2x^6 + 4x^7 + 4x^8 + x^9   x^5 + 2x^6 + 4x^7 + 7x^8 + 5x^9 + x^10
01     x^5 + 3x^6 + x^7                 3x^6 + 4x^7 + x^8
10     3x^6 + 4x^7 + x^8                x^6 + 6x^7 + 5x^8 + x^9
11     3x^6 + 4x^7 + x^8                x^6 + 6x^7 + 5x^8 + x^9
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7 Using the SRSA to Compute the WDF 

We have seen that the stable terms of the GVA computations yield a truncated 
version of the WDS. From this partial WDS, we want to find the recursion 
relation and thus compute the WDF whose series expansion yields the entire 
WDS. It can be shown that this is equivalent to solving the Key Equation. 
The Euclidean Algorithm can be used to solve the Key Equation but the SRSA 
provides an algorithmically appealing alternative. The SRSA synthesizes the 
minimum length linear feedback shift register (LFSR) which generates the WDS 
coefficients. Theorem 1 in [1] allows one to compute an upper bound on the
number, M, of required stable terms of the WDS based on a bound on the
X-degree, D, of the denominator of the WDF. Letting A be the weighted adjacency
matrix of the state diagram of the code, it can be shown that

D < max{deg(J — A)ij} and M < 2D. 

I 

We assume we have a non-catastrophic encoder (reasonable since catastrophic 
encoders are not very interesting) . A non-catastrophic encoder for the code im- 
plies that the state diagram has no zero weight loops except for the self-loop at 
the zero state. This means there are a finite number of error events with a given 
Hamming weight. 

Let us first compute the OWDF (as in [8]). The non-catastrophic encoder
assures that the OWDS, $S(X)$, is an element of $\mathbb{Z}[[X]]$. Theorem 1 implies that
the set of recursions is a principal ideal and the SRSA can be used to find the
generator of that ideal, which is the minimum monic recursion in $\mathbb{Z}[X]$.

Now let us compute the IOWDF. It is clear that the IOWDS,
$S(X, Y) = \sum_{i,j} s_{ij} X^i Y^j$, is an element of $\mathbb{Z}[[X, Y]]$. The IOWDS can also be
represented by an infinite two dimensional array over $\mathbb{Z}$, and computing the
IOWDF is equivalent to finding the “minimum” two-dimensional recursion in this
array. Although it is not clear what “minimum” means for recursions in multiple
variables, Theorem 1 implies that the set of finite recursions is a principal ideal.
The “minimum” recursion we are searching for is the generator of that ideal. The
non-catastrophic encoder implies that $S(X, Y)$ is actually an element of
$\mathbb{Z}[Y][[X]]$, i.e. it is a univariate series with polynomial coefficients. In other
words, the IOWDS can be represented as a sequence of polynomials in Y. The SRSA
can then be used to compute the minimum monic recursion in this sequence.

The algorithm used to compute the recursion relation using the GVA and 
SRSA is described in pseudocode below. 

Algorithm 1:

    label and modify trellis as described
    while (number of stable terms < bound)
        run one GVA iteration
        if (one more stable term)
            call shift register synthesis algorithm
        end if
    end while
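
The synthesis step of Algorithm 1 can be illustrated with a Berlekamp–Massey style iteration. This is a stand-in chosen for illustration, not necessarily the SRSA variant the authors use: it works over the rationals (via `fractions.Fraction`) on integer coefficient sequences, so it covers the OWDS case only, and the function name `berlekamp_massey` is hypothetical.

```python
from fractions import Fraction


def berlekamp_massey(seq):
    """Connection polynomial C (C[0] = 1) of the shortest linear recursion
    sum_i C[i] * seq[n - i] = 0 satisfied by the given terms."""
    s = [Fraction(x) for x in seq]
    C = [Fraction(1)]  # current connection polynomial
    B = [Fraction(1)]  # copy of C taken before the last length change
    L, m, b = 0, 1, Fraction(1)
    for n in range(len(s)):
        # Discrepancy between the predicted and the actual n-th term.
        d = s[n] + sum(C[i] * s[n - i] for i in range(1, len(C)))
        if d == 0:
            m += 1
            continue
        T = list(C)
        coef = d / b
        C = C + [Fraction(0)] * (len(B) + m - len(C))
        for i, bi in enumerate(B):
            C[i + m] -= coef * bi
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    return C + [Fraction(0)] * (L + 1 - len(C))


if __name__ == "__main__":
    # Stable WDS coefficients 1, 2, 4 of the 4-state example (degrees 5, 6, 7).
    print(berlekamp_massey([1, 2, 4]))
```

For the stable coefficients 1, 2, 4 of the example code, the returned polynomial is 1 - 2x, i.e. each coefficient is twice the previous one.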

8 Conclusion

In this paper, we presented a novel method to compute the WDF of a convolu- 
tional code. We used the generalized Viterbi Algorithm to compute a recursive 
series and shift register synthesis to find the minimum recursion in this series. 
Areas of further research include issues related to the implementation of the 
algorithm. Since the algorithm requires symbolic computation in multivariate 
polynomial rings, an efficient implementation is needed to compute the weight 
distribution of large constraint length convolutional codes. 

We conclude by giving the OWDF for the best rate 1/2, constraint length
$\nu = 6$ convolutional code, which has generator G = [133 171]. The coefficients
of the numerator $\Omega(X)$ and the denominator $\Lambda(X)$ are shown below:

Ω = [0 0 0 0 0 0 0 0 0 0 11 0 -6 0 -25 0 1 0 93 0 -15 0 -176
0 -76 0 243 0 417 0 -228 0 -1156 0 -49 0 2795 0 611 0 -5841 0
-1094 0 9575 0 1097 0 -11900 0 -678 0 11218 0 235 0 -8068 0 -18
0 4429 0 -20 0 -1838 0 8 0 562 0 -1 0 -120 0 0 0 16 0 0 0 -1]

Λ = [1 0 -4 0 -6 0 -30 0 40 0 85 0 -81 0 -345 0 262 0 844 0
-403 0 -1601 0 267 0 2509 0 389 0 -3064 0 -2751 0 2807 0 8344 0
-1960 0 -16133 0 1184 0 21746 0 -782 0 -21403 0 561 0 15763 0 -331
0 -8766 0 131 0 3662 0 -30 0 -1123 0 3 0 240 0 0 0 -32 0 0 0 2]
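
To read the WDS off such a rational function, the quotient can be expanded as a power series by long division. The sketch below is illustrative only: `expand`, `OMEGA` and `LAMBDA` are hypothetical names, and the two lists hold just the leading coefficients (up to degree 16) copied from the lists above, which is enough to determine the series up to that degree.

```python
def expand(num, den, n_terms):
    """First n_terms power-series coefficients of num(X)/den(X); den[0] must be 1."""
    series = []
    for n in range(n_terms):
        acc = num[n] if n < len(num) else 0
        for k in range(1, min(n, len(den) - 1) + 1):
            acc -= den[k] * series[n - k]
        series.append(acc)
    return series


# Leading coefficients only (degrees 0..16), copied from the lists in the text.
OMEGA = [0] * 10 + [11, 0, -6, 0, -25, 0, 1]
LAMBDA = [1, 0, -4, 0, -6, 0, -30, 0, 40, 0, 85, 0, -81, 0, -345, 0, 262]

wds = expand(OMEGA, LAMBDA, 17)

if __name__ == "__main__":
    print([(deg, c) for deg, c in enumerate(wds) if c])
```

The nonzero entries of `wds` are the first terms of the series whose recursion the shift register synthesis step is meant to recover.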

Abstract. A recursive convolutional encoder can be regarded as an in-
finite impulse response system over the Galois field of order 2. First, in
this paper, we introduce finite response input sequences for recursive con-
volutional codes that give finite weight output sequences. In practice, we
often need to describe the finite response sequence with a certain Ham-
ming weight. Then, different properties of finite response input sequences
are presented. It is shown that all finite response input sequences with a
certain Hamming weight can be obtained in closed-form expressions from
the so-called basic sequences. These basic sequences are presented for im-
portant recursive convolutional encoders and some possible applications
are given.



1 Introduction 

Recursive convolutional codes have seldom been employed in the past because
their weight enumerating function is equivalent to that of the non-recursive
convolutional codes [1]. Interest in them has been renewed since they have been
used to construct serial and parallel concatenated convolutional codes (turbo
codes) whose performance is near the Shannon limit (see [2] and [3]).

The works of Battail et al. [4] have shown that recursive convolutional codes 
mimic random coding if the denominator polynomial is chosen as a primitive 
polynomial. In comparison with non recursive convolutional codes, the input 
sequences with finite weight are associated with output sequences with infinite 
weight, except for a fraction of finite weight input sequences which generate 
finite weight output sequences. These input sequences are called finite response 
input sequences (FRISs). 

In [5], FRISs have been introduced; the enumeration of FRISs for a Hamming
weight w=2 is simple, but no practical method to enumerate these
sequences with a Hamming weight w greater than 2 has yet been given.

The goal of this paper is to study the properties of finite response input
sequences with weight w and to show how these sequences can be enumerated
from one or more basic FRISs.

In the next section, we recall some classical definitions of convolutional codes.
In the third section, we give different properties of FRISs and introduce basic
FRISs. An example is given to show how these properties can be used to enumerate
all the FRISs in closed form. Then, the basic FRISs are presented for some
important recursive convolutional encoders. Finally, we will show how these
properties can be used to find the Hamming weight of the output sequence of any
FRIS and to build interleavers for turbo codes.



2 Review of Basics 

In order to keep the following expositions self-contained, we shall introduce re- 
cursive convolutional codes and some definitions to be used later in this section. 

A rate 1/r recursive convolutional encoder maps the input sequence of infor-
mation bits

$u_0, u_1, u_2, \ldots$

into the output sequence of r-dimensional code blocks

$y_0, y_1, y_2, \ldots$

with

$y_n = (y_{1n}, y_{2n}, \ldots, y_{rn})$.

The encoder also goes through the internal state sequence

$s_0, s_1, s_2, \ldots,$

where each encoder state $s_n$ at time n is an M-tuple:

$s_n = [s_{1n}, s_{2n}, \ldots, s_{Mn}]$.

M is the number of delay cells of the encoder and $s_{in}$ is the state at time n
of the i-th delay cell.

The structure of a recursive systematic convolutional encoder of rate 1/2 is
shown in Fig. 1.

A recursive encoder can also be regarded as an infinite impulse response (IIR)
system over the finite field GF(2) with input $u(D)$ and output $y(D)$, where D
is the unit-delay operator:

$y(D) = u(D) G(D)$

with

$G(D) = \left( \frac{P_1(D)}{Q(D)}, \frac{P_2(D)}{Q(D)}, \ldots, \frac{P_r(D)}{Q(D)} \right)$   (1)

and $y(D) = (y_1(D), y_2(D), \ldots, y_r(D))$.

Fig. 1. The structure of a recursive systematic convolutional encoder of rate 1/2.

Here $Q(D)$ is a primitive polynomial of degree M:

$Q(D) = q_0 + q_1 D + \ldots + q_M D^M$

and $P_i(D)$ is a polynomial of degree at most equal to M:

$P_i(D) = p_{0i} + p_{1i} D + \ldots + p_{Mi} D^M$.

When the recursive convolutional encoder is systematic, we have $y_{1n} = u_n$
since $P_1(D) = Q(D)$.

Since $Q(D)$ is a primitive polynomial, the encoder generates a pseudo noise
(PN) sequence or a maximum length sequence. The period of the PN sequence
is $2^M - 1$. The weight of the output sequence for one period of the PN sequence
is $2^{M-1}$ [6].
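
The period and weight claims can be checked numerically on small cases. The sketch below is an illustration under stated assumptions: it computes the impulse response of $1/Q(D)$ over GF(2), i.e. it takes $P_i(D) = 1$ rather than a particular encoder from the paper, and the helper names `impulse_response` and `smallest_period` are hypothetical.

```python
def impulse_response(q_taps, n):
    """First n output bits of 1/Q(D) over GF(2) for a unit impulse input.

    q_taps lists the nonzero exponents of Q(D) other than 0,
    e.g. [1, 3] for Q(D) = 1 + D + D^3.
    """
    h = []
    for t in range(n):
        bit = 1 if t == 0 else 0  # the impulse itself
        for e in q_taps:
            if t - e >= 0:
                bit ^= h[t - e]   # feedback taps, addition modulo 2
        h.append(bit)
    return h


def smallest_period(seq):
    """Smallest p such that seq[i] == seq[i + p] over the whole observed window."""
    for p in range(1, len(seq)):
        if all(seq[i] == seq[i + p] for i in range(len(seq) - p)):
            return p
    return None


if __name__ == "__main__":
    for taps, M in (([1, 3], 3), ([1, 4], 4)):
        h = impulse_response(taps, 4 * (2 ** M - 1))
        p = smallest_period(h)
        print(M, p, sum(h[:p]))
```

For M = 3 and M = 4 the observed period is $2^M - 1$ and one period contains $2^{M-1}$ ones, in line with the statement above.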

An example of a state diagram is shown in Fig. 2 for the primitive polynomial
$Q(D) = 1 + D + D^3$. Each edge is labeled with $w_i$ and $w_o$, which are
respectively the weights of the corresponding input and output bits. As the edge
drawn in dotted line corresponds to an input bit equal to 0, we can clearly
observe the loop corresponding to the PN sequence of period 7 and that the
output weight of the PN sequence is equal to 4.

We say that the encoder with $Q(D)$ is IIR, since the weight-one input se-
quence (impulse input) produces an infinite response, i.e. an infinite weight out-
put sequence.

Definition 1. A finite response input sequence (FRIS) is an input sequence
whose first "1" causes the encoder state to leave the zero state $S_0 = [0, 0, \ldots, 0]$
at time $n_0$ and whose last "1" brings it back to $S_0$ at time $n_0 + L - 1$ ($L > 0$).

A FRIS will produce a finite weight output sequence. These FRISs are repre-
sented by $F(D)$.

Fig. 2. The state diagram for a primitive polynomial $Q(D) = 1 + D + D^3$.

3 Properties of Finite Response Input Sequences (FRIS)

We have the following theorems about F{D). 

Theorem 1. A FRIS of a recursive convolutional encoder satisfies the equation:

$F(D) \equiv 0 \pmod{Q(D)} \pmod{2}$.   (2)

Proof. From (1), $y_i(D)$ becomes a finite order polynomial, i.e. a finite weight
output sequence, if and only if $Q(D) \mid u(D) \pmod{2}$, i.e. $Q(D)$ is a factor of
$u(D)$ over the finite field GF(2).

Since $Q(D)$ is a primitive polynomial, the encoder generates a maximum
length sequence of period $2^M - 1$. We then have:

$D^0 \equiv D^{2^M - 1} \pmod{Q(D)} \pmod{2}$.   (3)

Then, (2) becomes:

$F(D) \equiv 0 \pmod{Q(D)} \pmod{D^{2^M - 1} - 1} \pmod{2}$.   (4)
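
The divisibility condition of Theorem 1 is easy to test by machine. The sketch below is a minimal illustration, assuming polynomials over GF(2) are stored as Python integers (bit i is the coefficient of $D^i$); the helper names `gf2_mod`, `poly` and `is_fris` are hypothetical, and the check covers only the condition $Q(D) \mid F(D)$.

```python
def gf2_mod(a, q):
    """Remainder of a(D) divided by q(D) over GF(2); bit i holds the coefficient of D^i."""
    dq = q.bit_length() - 1
    while a.bit_length() - 1 >= dq:
        a ^= q << (a.bit_length() - 1 - dq)  # cancel the leading term of a
    return a


def poly(exponents):
    """Build the integer representation of sum of D^e over the given exponents."""
    p = 0
    for e in exponents:
        p ^= 1 << e
    return p


def is_fris(exponents, q):
    """True if the input with ones at the given times is divisible by Q(D)."""
    return gf2_mod(poly(exponents), q) == 0


if __name__ == "__main__":
    Q = poly([0, 1, 3])  # Q(D) = 1 + D + D^3
    for exps in ([0, 7], [0, 1, 3], [0, 2, 3, 4], [0, 4, 6]):
        print(exps, is_fris(exps, Q))
```

With $Q(D) = 1 + D + D^3$, the inputs $1 + D^7$ and $1 + D^2 + D^3 + D^4$ pass the test, while $1 + D^4 + D^6$ does not.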

Theorem 2. If we have a FRIS $F(D)$ of weight w, noted $F^{(w)}(D)$:

$F^{(w)}(D) = D^{n_1} + D^{n_2} + \ldots + D^{n_w}$   (5)

where $n_1 = 0$ and $n_2, \ldots, n_w$ are any positive integers, then there exists a
family of weight w FRISs:

$F^{(w)}_{m_0 m_2 \ldots m_w}(D) = D^{m_0} \left( D^{n_1 + m_1(2^M - 1)} + D^{n_2 + m_2(2^M - 1)} + \ldots + D^{n_w + m_w(2^M - 1)} \right)$   (6)

where $m_1 = 0$ and $m_0, m_2, \ldots, m_w$ can be any integer, positive, negative, or zero.

Proof. From (4), (5) and (6), we obtain:



$F^{(w)}_{m_0 m_2 \ldots m_w}(D) \equiv D^{m_0} F^{(w)}(D) \equiv 0 \pmod{Q(D)} \pmod{D^{2^M - 1} - 1} \pmod{2}$.   (7)

This theorem tells us that if we find any FRIS in a family, we can deduce 
all the FRISs of this family. We note that there are two different kinds of FRISs 
called simple and complex FRISs. 

Definition 2. A FRIS is simple if only its last "1" brings the encoder
state back to $S_0$. Otherwise, the FRIS is complex, since the encoder state
returns to $S_0$ more than once.

We will now choose a unique representative for each family of simple FRISs,
called the basic FRIS.

Definition 3. $F_0^{(w)}(D)$ is called a basic FRIS for weight w if and only if the
following three conditions are satisfied:

$F_0^{(w)}(D)$ is a FRIS with the form (5)   (8)

$0 < n_i - n_{i-1} < 2^M - 1 \quad (\forall i)$   (9)

$n_w = \min$.   (10)

Condition (8) means that the first "1" of a basic FRIS should occur at time 0;
condition (9) means that after rearranging $n_1, n_2, \ldots, n_w$ in ascending order, the
duration between two consecutive "1"s should be less than $2^M - 1$; condition (10)
means that we choose as the basic FRIS the sequence with the minimal length.
The basic FRISs of a recursive convolutional encoder depend only on $Q(D)$.

We call $F^{(w)}(D)$ which satisfies conditions (8) and (9) a secondary basic
FRIS $F_s^{(w)}(D)$.

The next theorem will show how to describe all the FRISs with weight w.

Theorem 3. Supposing $w = \sum_i w_i$ ($w_i > 1$), all the FRISs can be obtained in
the form (6) from $F_0^{(w)}(D)$ and from combinations of $F_s^{(w_i)}(D)$.

In particular for w=2 and w=3, since no combination $w = \sum_i w_i$ ($w_i > 1$)
exists, each FRIS is obtained from basic FRISs according to (6).

The next theorem will give us the total number of basic FRISs for each weight w.

Theorem 4. For w=2, there exists only one basic FRIS: $1 + D^{2^M - 1}$.

For w=3, there exist

$\left\lceil \frac{2^M - 2}{A_3^3} \right\rceil$ basic FRISs.   (11)

For $4 \le w < 2^M - 1$, there exist

$\left\lceil \frac{(2^M - 2)(2^M - 3)^{w-3} - N_w}{A_w^w} \right\rceil$ basic FRISs.   (12)

$N_w$ is the number of $F^{(w)}(D)$ which are constructed from secondary basic FRISs
$F_s^{(w_i)}(D)$; $A_n^p$ is the number of ordered selections of p elements from a set of n
elements, and $\lceil c \rceil$ means c rounded to the nearest integer towards plus infinity.

Proof. Since $Q(D)$ is a primitive polynomial of degree M, $(s_1 \mid s_0 = S_0) = S_1$
when an input "1" occurs at time 0, where $S_1 = [1, 0, \ldots, 0]$; and then, in the
absence of an input, $s_n$ goes through all possible $2^M - 1$ nonzero encoder states
and repeats with period $2^M - 1$; it returns to $S_0$ if and only if an input "1"
occurs and the current state is S* = [0, So, if we exclude the first “1” 

and the last “1” of this FRIS, the w — 2 other “l”s can occur under any state 
Snjs„. yf S'o,s„. yf yf Note that, for the second ”1” of the FRIS, 

— SQ- 

Therefore, there are $(2^M - 2)(2^M - 3)^{w-3}$ different secondary basic FRISs
including those that are constructed from $F_s^{(w_i)}(D)$; on the other hand, from (6)
each family includes $A_w^w$ secondary basic FRISs if $n_i - n_{i-1} \ne n_j - n_{j-1}$ ($i \ne j$)
and possibly fewer otherwise. As a result, we conclude that there exist
$\lceil ((2^M - 2)(2^M - 3)^{w-3} - N_w)/A_w^w \rceil$ basic FRISs.

For w = 2, there is only one basic FRIS, which has the first "1" corresponding
to the leaving of the zero state to $S_1$ and the other "1" for the return from
$s_{2^M - 1} = S^*$ to the zero state $S_0$, that is, $F_0^{(2)}(D) = 1 + D^{2^M - 1}$.

For w = 3, $N_3 = 0$, so there are $\lceil (2^M - 2)/A_3^3 \rceil$ basic FRISs.



Example 1. Supposing M=3 and $Q(D) = 1 + D + D^3$.

For w=2, since $2^3 - 1 = 7$, $F_0^{(2)}(D) = 1 + D^7$.

All weight-2 FRISs can be written as follows according to (6):

$F^{(2)}_{m_0 m_2}(D) = D^{m_0}(1 + D^{7 + 7 m_2})$.

For w=3, there exists $\lceil (2^3 - 2)/A_3^3 \rceil = 1$ basic FRIS, $F_0^{(3)}(D) = 1 + D + D^3$.

All weight-3 FRISs can be written as follows according to (6):

$F^{(3)}_{m_0 m_2 m_3}(D) = D^{m_0}(1 + D^{1 + 7 m_2} + D^{3 + 7 m_3})$,

for example, 

F^%^_i{D) = 1 + 792 + 796, 

= 1 + 794 + 796. 

For ■u;=4, since 4 = 2+2, and F^\d) = 1 + 79^, we have F^^\D) which are 
combinations of secondary basic FRISs Fg^\D) written by F^^\d): 

Fi")(79) = Fg(")(79) +79b+g(2)(^)^ ^ 1,2,..., 6. 

Clearly, here $N_4 = 6$, $\lceil ((2^3 - 2)(2^3 - 3)^{4-3} - N_4)/A_4^4 \rceil = 1$ and we have
one $F_0^{(4)}(D)$, that is, $F_0^{(4)}(D) = 1 + D^2 + D^3 + D^4$. Therefore, the following
two equations describe all simple weight-4 FRISs:

F^'^\D) = 79™0 _|_ £)2-|-7m2 _|_ £)3+7m3 ]ji+7m4,'^ ^ 
where $m_0$, $m_i$ can be any integer and $l_i = 1, 2, \ldots, 6$.

And the following equation describes all complex weight-4 FRISs:
F^oL{D) = 

where $m_0$, $m_i$ can be any integer and $m_i \ne m_j$ ($i \ne j$).



4 Tables 

In this section, we will give a list of basic FRISs for recursive convolutional 
encoders with M=2, 3, 4 and 5. The following basic FRISs have been obtained 
from an exhaustive search since there is no known method to find them. 
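
An exhaustive search of this kind can be sketched for weight 3. The code below is one possible reading of Definition 3, not the authors' program: it assumes a family is identified by the residues of the exponents modulo $2^M - 1$ up to a common shift, and it keeps the shortest member of each family that satisfies condition (9). The names `gf2_mod` and `basic_fris_weight3` are hypothetical, and polynomials are stored as integers (bit i is the coefficient of $D^i$).

```python
def gf2_mod(a, q):
    """Remainder of a(D) divided by q(D) over GF(2); bit i holds the coefficient of D^i."""
    dq = q.bit_length() - 1
    while a.bit_length() - 1 >= dq:
        a ^= q << (a.bit_length() - 1 - dq)
    return a


def basic_fris_weight3(q):
    """Exponent pairs (a, b) such that 1 + D^a + D^b is the shortest FRIS of its family."""
    M = q.bit_length() - 1
    P = (1 << M) - 1                      # period 2^M - 1
    best = {}                             # family key -> shortest (a, b) found
    for a in range(1, P):                 # gap 0 < a < P
        for b in range(a + 1, a + P):     # gap 0 < b - a < P
            if gf2_mod(1 | (1 << a) | (1 << b), q):
                continue                  # not divisible by Q(D): not a FRIS
            residues = (0, a % P, b % P)
            # Canonical family key: smallest shifted residue set containing 0.
            key = min(tuple(sorted((r - s) % P for r in residues)) for s in residues)
            if key not in best or b < best[key][1]:
                best[key] = (a, b)
    return sorted(best.values())


if __name__ == "__main__":
    for q in (0b111, 0b1011, 0b10011):    # 1+D+D^2, 1+D+D^3, 1+D+D^4
        print(bin(q), basic_fris_weight3(q))
```

For $Q(D) = 1 + D + D^4$ the search returns three families, which agrees with the count $\lceil (2^4 - 2)/A_3^3 \rceil = 3$ given by (11).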



Table 1. Basic FRISs for M = 2, $Q(D) = 1 + D + D^2$.

w    F_0^{(w)}(D)
2    1 + D^3
3    1 + D + D^2
4    1 + D + D^3 + D^4



Table 2. Basic FRISs for M = 3, $Q(D) = 1 + D + D^3$.

w    F_0^{(w)}(D)
2    1 + D^7
3    1 + D + D^3
4    1 + D^2 + D^3 + D^4



Table 3. Basic FRISs for M = 4, $Q(D) = 1 + D + D^4$.



w 




w 




w 


^0 


2 


1 + D^^ 


4 


1 + D'^ + D‘^ + D^ 


4 


l + D^ + D^ + D^ 


3 


1 + D + D* 


4 


1 + D^ + D^ + D'^ 


4 


l + D^ + D* + D^ 


3 


1 + D^ + D^ 


4 


1 + D + 


4 


l + D* + D^ + 


3 


1 + D^ + 


4 


1 + D + D^ + 

Table 4. Basic FRISs for M = 5 Q{D) = I + D + + D'^ + .



w 


p(^) 


W 


p(^) 


1/; 


^0 


2 


1 + 71^1 


4 


1 + Tl® + 7)’2 + 7)’® 


4 


1 + 71® + 7)’® + 71’’’ 


3 


1 + 71^ + 71® 


4 


1 + 71 + 71® + 71” 


4 


1 + 71® + 7)’® + 71’’’ 


3 


1 + 77’’ + 71® 


4 


1 + 71® + 7)’’ + 7)’^ 


4 


1 + 71® + 71“ + 71’® 


3 


1 + D + D^^ 


4 


1 + 712 + 71® + 7)’^ 


4 


1 + 71® + 7)’2 + 71’® 


3 


1 + 71® + 71’® 


4 


1 + 71’“ + 7)’® + 71” 


4 


1 + 71'^ + 7)’® + 71’® 


3 


1 + 71'‘ + 71’’’ 


4 


1 + 71^ + 71® + 7)’® 


4 


1 + 71® + T)’’ + 71’“ 


4 


1 + 71'‘ + 71® + 71® 


4 


1 + 71“ + 7)’“ + 7)’® 


4 


1 + 71® + 71® + 71’“ 


4 


1 + 71 + 71^ + 71’' 


4 


1 + 71® + 71” + 7)’® 


4 


1 + 71’ + 7)’“ + 71’“ 


4 


1 + 71 + 71® + 71’“ 


4 


1 + 71® + 71” + 7)’® 


4 


1 + 71” + 7)’® + 71’“ 


4 


1 + 7)2 + 71’’ + 71” 


4 


1 + 71“ + 71” + 7)’® 


4 


1 + 71’“ + T)’’’ + 712“ 


4 


1 + 7)® + 71® + 71” 


4 


1 + 71’® + 7)’® + 71’® 


4 


1 + 71® + 71“ + 71““ 


4 


1 + 7)^ + 71“ + 7)’2 


4 


1 + 71 + 71“ + Tl’’’ 


4 


1 + 71® + 7)’® + Tl®’ 


4 


1 + 7)® + 71’“ + 7)’2 


4 


1 + 71’’ + 7)’2 + T)”" 


4 


1 + 71® + 7)”’ + 71“’ 


4 


1 + 7)® + 71® + 7)’® 


4 


1 + 71” + 7)’® + 71’’’ 







5 Examples of Application 

5.1 Hamming Weight of the Output Sequences of Finite Input 
Response Sequences 

In this section, we will show how to use the properties introduced above to 
compute the Hamming weight of the output sequence of any FRIS. 

Theorem 5. Consider an arbitrary FRIS of weight w F^^^D): 

F^^\D) = + + 

where ni = 0 and Ui > Ui-i, (Vi). d[F^'"\D)] denotes the Hamming weight of 
the output sequence. We have 

W 

(71)] = d[F^^^ (71)] + d[PN] (13) 

i=2 

where Fg^\D) is the secondary basic FRIS 

F^^\d) = D'l + ... + £)Ei'y (14) 

with = 0 

li = m — Ui-i — 1 (mod 2^ — 1) + 1 
, _ r(ni - Ui-i - k)] 
$d[PN]$ is the weight of the output sequence for one period of the PN sequence.
$d[PN] = 2^{M-1}$ since $Q(D)$ is a primitive polynomial.

This theorem tells us that we can calculate the Hamming weight of the output
sequence of any FRIS from its associated secondary basic FRIS. We will now
give a method to find the secondary basic FRISs from the basic FRIS.
Consider a basic FRIS $F_0^{(w)}(D)$

$F_0^{(w)}(D) = D^{n_1} + D^{n_2} + \ldots + D^{n_w}$   (15)

where $n_1 = 0$, $n_i > n_{i-1}$. From this basic FRIS, we can deduce all the
simple FRISs of the family

$F^{(w)}(D) = D^{m_0} \left( D^{n_1 + m_1(2^M - 1)} + D^{n_2 + m_2(2^M - 1)} + \ldots + D^{n_w + m_w(2^M - 1)} \right)$

$= D^{l_1} + D^{l_2} + \ldots + D^{l_w}$

where $m_1 = 0$, $l_i = m_0 + n_i + m_i(2^M - 1)$ ($\forall i$).

All the secondary basic FRISs can be obtained from the basic FRIS by permu-
tation of $n_1, n_2, \ldots, n_w$ and then searching $m_0, m_2, \ldots, m_w$ to satisfy the inequality
$l_1 < l_2 < \ldots < l_w$ and $l_1 = 0$, $l_i - l_{i-1} < 2^M - 1$.

Example 2. Supposing M=3, w=3 and $Q(D) = 1 + D + D^3$. There is only one
basic FRIS:

$F_0^{(3)}(D) = D^0 + D + D^3 = 1 + D + D^3$.

We have

$F_s^{(3)}(D) = D^0 + D^3 + D^{1+7} = 1 + D^3 + D^8$ with $m_0 = 0$, $m_2 = 0$, $m_3 = 1$.

$F_s^{(3)}(D) = D^{-1}(D^1 + D^3 + D^{0+7}) = 1 + D^2 + D^6$ with $m_0 = -1$, $m_2 = 0$, $m_3 = 1$.



5.2 Interleaver Construction for Turbo Codes 

Turbo codes are a parallel concatenation of recursive systematic convolutional 
codes [2]. The turbo encoder consists of two recursive convolutional codes and 
an interleaver of size N. An example of a turbo encoder is shown in Fig. 3. 

The N-bit information sequence $u(D)$ is encoded twice: firstly by C1 and
secondly, after interleaving, by C2. A tail sequence composed of M bits is added
after the information sequence in order to bring the internal state of the first
encoder to the zero state. As a consequence only FRISs are allowed.

So we can use the properties of FRISs for the construction of the interleaver.
The interleaver should improve the weight distribution and the free distance of
turbo codes. An optimal interleaver should map the input sequences $u(D)$ which
generate low weight output sequences $y_1(D)$ to sequences $v(D)$ which generate
high weight output sequences $y_2(D)$ and vice versa.

For the construction of the interleaver, we can take into account only the
input sequences $u(D)$ which generate low weight output sequences. These se-
quences can be enumerated using the properties of FRISs introduced above. The
weight of the associated output sequence $y_1(D)$ is calculated by using (13). The
weight of the output sequence $y_2(D)$ can also be obtained using a generalization
of this principle.

Fig. 3. The structure of a turbo encoder of rate 1/3.

In [7], we have shown that these properties combined with a tree search
method for construction of the interleaver can produce very good interleavers.

6 Conclusion 

The finite response input sequences (FRISs) for a recursive convolutional encoder 
with a primitive polynomial can be defined by (4). In this paper, new practical 
properties of FRISs with a certain Hamming weight w are presented. We have 
introduced the basic FRIS and shown that we could write all FRISs with weight 
w in closed-form expressions from these basic FRISs. 

These properties can be employed in many applications, such as the compu-
tation of the weight enumerators of these codes and the construction of efficient
interleavers for turbo codes.

Abstract. A group covering design (GCD) is a set of mn points in n
disjoint groups of size m and a collection of b k-subsets, called blocks,
such that every pairset not contained in the same group occurs in at least
one block. For m = 1, a GCD is a covering design [5]. Particular cases
of GCD's, namely transversal covers, covering arrays, Sperner systems
etc. have been extensively studied by Poljak and Tuza [22], Sloane [24],
Stevens et al. [26] and others. Cohen et al. [8], [9] and Sloane [24] have
also shown applications of these designs to software testing, switching
networks etc. Determining the group covering number, the minimum
value of b, for given k, m and n, in general is a hard combinatorial prob-
lem. This paper determines a lower bound for b, analogous to the Schönheim
lower bound for covering designs [23]. It is shown that there exist two
classes of GCD's (Theorems 15 and 18) which meet this bound. More-
over, a construction of a minimum GCD from a covering design meeting
the Schönheim lower bound is given. The lower bound is further improved
by one for three different classes of GCD's. In addition, constructions of
group divisible designs with consecutive block sizes (Theorems 20 and
21) using properties of GCD's are given.



1 Introduction 

Let K and M be sets of positive integers and let $\lambda$ be a positive integer. A triple
$(X, \mathcal{G}, \mathcal{B})$ is a group divisible design (GDD), denoted $GD[K, \lambda, M; v]$, if

(i) X is a finite set of v elements (points);

(ii) $\mathcal{G} = \{G_1, G_2, \ldots, G_n\}$, n > 1, is a partition of X with $|G_i| \in M$. The
elements of $\mathcal{G}$ are called groups;

(iii) $\mathcal{B}$ is a collection of subsets, called blocks, of X such that $|B| \in K$ and
$|B \cap G| \le 1$ for every $B \in \mathcal{B}$ and $G \in \mathcal{G}$;

(iv) every pairset $\{x, y\} \subset X$ such that x and y belong to distinct groups is
contained in exactly $\lambda$ blocks.

Observe that a $GD[K, \lambda, \{1\}; v]$ is a pairwise balanced design, denoted
$PBD[K, \lambda; v]$. If $K = \{k\}$ and $M = \{m\}$ then $GD[K, \lambda, M; v]$ is often de-
noted by $GD[k, \lambda, m; v]$. A $GD[k, \lambda, m; km]$ is called a transversal design and is
denoted by $TD[k, \lambda; m]$, while a $GD[k, \lambda, 1; v]$ is known as a balanced incomplete
block design, denoted $B[k, \lambda; v]$. If $\lambda = 1$ one usually represents $GD[k, \lambda, m; v]$,
$TD[k, \lambda; m]$, respectively by $GD[k, m; v]$, $TD[k; m]$.

A pair $(X, \mathcal{B})$ is said to be a $(v, k, \lambda)$-covering design, denoted $AD[k, \lambda; v]$, if
$|X| = v$ and $\mathcal{B}$ is a collection of k-subsets of X such that every distinct pair of
elements of X is contained in at least $\lambda$ blocks of $\mathcal{B}$. The number
$C(k, \lambda; v) = \min\{|\mathcal{B}| : (X, \mathcal{B}) \text{ is an } AD[k, \lambda; v]\}$ is called the covering number.
In [23], Schönheim has shown that

$C(k, \lambda; v) \ge \left\lceil \frac{v}{k} \left\lceil \frac{\lambda(v - 1)}{k - 1} \right\rceil \right\rceil = \alpha(k, \lambda; v)$   (1)

where $\lceil x \rceil$ is the least integer such that $x \le \lceil x \rceil$. This has been further sharpened
by Hanani [14] in the following theorem.

Theorem 1. [14] Let $(X, \mathcal{B})$ be a covering design $AD[k, \lambda; v]$. If
$\lambda(v - 1) \equiv 0 \pmod{k - 1}$ and $\lambda v(v - 1)/(k - 1) \equiv -1 \pmod{k}$ then
$C(k, \lambda; v) \ge \alpha(k, \lambda; v) + 1$.

In [4], Caro and Raphael have shown that if $\lambda = 1$ and the conditions of Theorem 1
are not satisfied then equality in (1) is attained for $v > v_0(k)$.



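Theorem 2. [4] For $v > v_0(k)$,
$C(k, 1; v) = \left\lceil \frac{v}{k} \left\lceil \frac{v - 1}{k - 1} \right\rceil \right\rceil = \alpha(k, 1; v)$, unless
$k - 1$ is even, $(v - 1) \equiv 0 \pmod{k - 1}$ and $v(v - 1)/(k - 1) \equiv -1 \pmod{k}$, in
which case $C(k, 1; v) = \alpha(k, 1; v) + 1$.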



Recently, the authors [2], Honkala [15] and Zhang [28] have used covering
designs to improve lower bounds on binary covering codes. An attempt to extend
the counting arguments of covering designs to study q-ary covering codes for
$q \ge 3$ has been made by Chanduka [5] and by Chen and Honkala [7]. It motivates
a study of a new combinatorial design, called group covering design, which
generalizes the concepts of covering designs and group divisible designs. Particular
cases of group covering designs, namely, transversal covers, covering arrays, Sperner
systems etc. have been extensively studied by Poljak and Tuza [22], Sloane [24],
Stevens et al. [26] and others. Cohen et al. [8], [9] and Sloane [24] have also
shown applications of these designs to software testing, switching networks etc.

Definitions and basic properties of group covering designs and the group covering
number are discussed in section 2 of this paper. Lower bounds for the group
covering number are also obtained in this section. Section 3 contains a few lower
bounds for the group covering number obtained from known designs. A few construc-
tions for a class of group covering designs are given in section 4. It is shown that
for any k there exist two classes of GCD's which meet the bound given by (5).
The paper concludes by constructing some GDD's with consecutive block sizes
using properties of group covering designs.

The following two theorems of Hanani [14] give necessary conditions for the 
existence of a GDD and a TD. 



Theorem 3. [14] If a group divisible design $GD[k, \lambda, m; v]$ exists then

$v \ge km$, $v \equiv 0 \pmod{m}$, $\lambda(v - m) \equiv 0 \pmod{k - 1}$   (2)

and

$\lambda v(v - m) \equiv 0 \pmod{k(k - 1)}$.   (3)

Theorem 4. [14] If v = km, m > 1, $\lambda = 1$ and $m \in TD(k)$ then $m \ge k - 1$.

Thus, if 1 < m < k - 1 then a GD[k, m; km] does not exist. Further, Tarry [27]
has shown that a GD[4, 6; 24] does not exist. Therefore, the conditions of The-
orem 3 are not sufficient for the existence of a $GD[k, \lambda, m; v]$. However, group
divisible designs with block sizes 3 and 4 are known to exist (see e.g., [3] and
[14]) for all $\lambda$, m and v satisfying the necessary conditions of Theorem 3 with
the exceptions of GD[4, 2; 8] and GD[4, 6; 24]. Very little is known for the case
$k \ge 5$ except for some special cases, see Assaf [1].

If $\lambda = 1$ and v > km then the following theorem gives a necessary condition
for the existence of a group divisible design.

Theorem 5. If there exists a group divisible design GD[k, m; v] with v > km
then $v \ge k(k - 1) + m$.

Proof. Let $(X, \mathcal{G}, \mathcal{B})$ be a group divisible design GD[k, m; v] and let $B \in \mathcal{B}$.
Then there exists a group $G \in \mathcal{G}$ such that $B \cap G = \emptyset$. Let $x_0 \in G$. For any
point $x \in B$, let $B_x$ be the block which contains the pairset $\{x_0, x\}$. Then
for any two distinct points x and y of B, $(B_x \setminus \{x_0\}) \cap (B_y \setminus \{x_0\}) = \emptyset$. For if
$y_0 \in B_x \cap B_y$, $y_0 \ne x_0$, then $\{y_0, x_0\} \subset B_x \cap B_y$, contradicting the fact that
each pair is contained in exactly one block. Also $(B_x \setminus \{x_0\}) \cap G = \emptyset$ for every
$x \in B$. Thus, the set $G \cup \left( \bigcup_{x \in B} B_x \setminus \{x_0\} \right)$ contains m + k(k - 1) points and this
number cannot exceed v. ■

Remark 1. If m = 1, the above theorem gives $v \ge k(k - 1) + 1$, the famous
Fisher's inequality for $\lambda = 1$.

The above theorem also shows nonexistence of many group divisible designs.
For example, it is easy to verify that GD[8, 2; 30], GD[9, 3; 51] and GD[10, 3; 48]
do not exist.
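
The three examples can be verified mechanically. The sketch below is illustrative: it reads condition (3) as a congruence modulo k(k - 1), and the function names `satisfies_theorem3` and `excluded_by_theorem5` are hypothetical.

```python
def satisfies_theorem3(k, m, v, lam=1):
    """Necessary conditions (2) and (3) for a GD[k, lam, m; v]."""
    return (
        v >= k * m
        and v % m == 0
        and (lam * (v - m)) % (k - 1) == 0
        and (lam * v * (v - m)) % (k * (k - 1)) == 0
    )


def excluded_by_theorem5(k, m, v):
    """True if v > km but v < k(k - 1) + m, so that no GD[k, m; v] can exist."""
    return v > k * m and v < k * (k - 1) + m


if __name__ == "__main__":
    for k, m, v in ((8, 2, 30), (9, 3, 51), (10, 3, 48)):
        print((k, m, v), satisfies_theorem3(k, m, v), excluded_by_theorem5(k, m, v))
```

Each of the three parameter sets passes the conditions of Theorem 3 and is nevertheless ruled out by Theorem 5, which is what makes them interesting examples.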

For other elementary results on GDD and covering designs the reader is 
referred to [10]. 

2 Group Covering Designs 

It has been observed that a group divisible design may not always exist for
certain values of v, k and m. To deal with such cases, the notion of group covering
designs analogous to covering designs is introduced in this section. The study
of group covering designs, apart from being a generalization of covering designs,
finds applications in q-ary covering codes, $q \ge 3$ (see [5], [6] and [7]).

Throughout this paper, unless otherwise stated, m, n and k are assumed to
be positive integers with $n \ge k \ge 2$ and the members of a pairset belong to
distinct groups.

Definition: A triple $(X, \mathcal{G}, \mathcal{B})$ is a group covering design (GCD), denoted
$GC[k, m; mn]$, if $|X| = mn$, $\mathcal{G}$ is a partition of X into m-subsets and $\mathcal{B}$ is
a collection of k-subsets of X such that $|G \cap B| \le 1$ for all $G \in \mathcal{G}$ and $B \in \mathcal{B}$
and every pairset is contained in at least one block of $\mathcal{B}$.

In case $|G \cap B| = 1$ for all $G \in \mathcal{G}$ and $B \in \mathcal{B}$, the group covering design
$GC[k, m; km]$ is called a transversal covering design, denoted $TC[k; m]$ (also
referred to as transversal covers [26]). It follows immediately that an $AD[k, 1; n]$
is a group covering design $GC[k, 1; n]$ and vice versa. The number
$G(k, m; mn) = \min\{|\mathcal{B}| : (X, \mathcal{G}, \mathcal{B}) \text{ is a } GC[k, m; mn]\}$ is called the group covering
number. In case n = k, $G(k, m; km)$ will be denoted by $tc(k; m)$. A trivial lower
bound for $G(k, m; mn)$ is given by

$G(k, m; mn) \ge \frac{n m^2 (n - 1)}{k(k - 1)}$.   (4)

Equality in (4) is attained if there exists a $GD[k, m; mn]$. Analogous to the
Schönheim lower bound [23] for covering designs, the following theorem gives
a better lower bound for the group covering number.



Theorem 6. Let $(X, \mathcal{G}, \mathcal{B})$ be a group covering design $GC[k, m; mn]$. Then

$G(k, m; mn) \ge \left\lceil \frac{mn}{k} \left\lceil \frac{m(n - 1)}{k - 1} \right\rceil \right\rceil = \beta(k, m; mn)$ (say).   (5)

Proof. Let $x_0 \in X$. Then the number of pairsets containing $x_0$ is m(n - 1).
If $B \in \mathcal{B}$ is a block containing $x_0$ then B can cover (k - 1) of these pairsets.
Hence the number of blocks in the group covering design containing $x_0$ is at
least $\lceil m(n - 1)/(k - 1) \rceil$. A simple counting argument gives
$k \, G(k, m; mn) \ge mn \lceil m(n - 1)/(k - 1) \rceil$. As $G(k, m; mn)$ is an integer, the result
follows. ■

Remark 2. If m = 1 then
$G(k, 1; n) = C(k, 1; n) \ge \left\lceil \frac{n}{k} \left\lceil \frac{n - 1}{k - 1} \right\rceil \right\rceil$, which is the best
known lower bound for the covering number $C(k, 1; n)$ [23].

A similar result was proved by Chen and Honkala [7] while estimating the 
minimum number of {R + 2)-weight codewords required to cover all 2-weight 
words of any q-ary covering code of length n and covering radius R. 

If a group divisible design $GD[k,m;mn]$ exists then $k$, $m$ and $n$ must satisfy
(2) and (3) and the number of blocks is $nm^2(n-1)/(k(k-1))$. This observation gives the
following theorem.

Theorem 7. Let $k$, $m$ and $n$ satisfy (2) and (3). If a group divisible design
$GD[k,m;mn]$ does not exist then $C(k,m;mn) \ge \frac{nm^2(n-1)}{k(k-1)} + 1$.

In particular, when $n = k$ and $1 < m < k-1$, by Theorem 4 $tc(k;m) \ge m^2 + 1$.
In [26], Stevens et al. have shown that if $m \ge 3$ and $k \ge m+2$ then
$tc(k;m) \ge m^2 + 3$ with the only exception being $tc(5;3) = 11$.

Let $(X,\mathcal{G},\mathcal{B})$ be a group covering design $GC[k,m;mn]$ and let $x_0 \in X$. Following
the arguments of Theorem 6, the number of blocks containing $x_0$, denoted
$f(x_0)$, satisfies

\[ f(x_0) \ge \left\lceil \frac{m(n-1)}{k-1} \right\rceil = m_0 \text{ (say)}. \]

Let $g(x_0) = f(x_0) - m_0$. If $g(x_0) > 0$ and $m(n-1) \equiv 0 \pmod{k-1}$ then there
exists at least one pairset $\{x_0, y\}$ which is contained in two or more blocks. This
proves the following lemma.



Lemma 1. Let $(X,\mathcal{G},\mathcal{B})$ be a group covering design $GC[k,m;mn]$ and let $\{x,y\}$
be a pairset. If either $g(x) = 0$ or $g(y) = 0$ and $m(n-1) \equiv 0 \pmod{k-1}$ then
the pairset $\{x,y\}$ is contained in exactly one block of the group covering design.

The following theorem is analogous to the result obtained by Hanani for
covering designs, see [14].

Theorem 8. Let $(X,\mathcal{G},\mathcal{B})$ be a group covering design $GC[k,m;mn]$. If
$m(n-1) \equiv 0 \pmod{k-1}$ and $nm^2(n-1)/(k-1) \equiv -1 \pmod{k}$ then
$C(k,m;mn) \ge \beta(k,m;mn) + 1$.

Proof. If possible, suppose $C(k,m;mn) = \beta(k,m;mn)$. By hypothesis
$\beta(k,m;mn) = \frac{1}{k}\left(\frac{nm^2(n-1)}{k-1} + 1\right)$, hence $\sum_{x \in X} g(x) = 1$ and there exists a
unique $x_0 \in X$ such that $g(x_0) = 1$ and $g(x) = 0$ for $x \ne x_0$. Hence by Lemma
1, there exists a pairset $\{x_0,u\}$ which is contained in at least two blocks. Thus
$g(u) > 0$ for $u \ne x_0$, a contradiction. ■

The above theorem gives many improvements to (5), e.g., $C(7,2;26) \ge 16$,
$C(7,2;32) \ge 24$, $C(5,4;32) \ge 46$, $C(5,4;52) \ge 126$, $C(11,4;64) \ge 36$,
$C(5,4;72) \ge 246$.
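The examples above can be rechecked mechanically. The following is a minimal sketch, not part of the original paper: the function names `beta` and `theorem8_bound` are illustrative, and the triple `(k, m, n)` stands for the parameters of a $GC[k,m;mn]$. It evaluates the bound (5) with integer arithmetic and adds 1 whenever both congruence hypotheses of Theorem 8 hold.

```python
def ceil_div(a, b):
    """Ceiling of a / b for positive integers, without floating point."""
    return -(-a // b)


def beta(k, m, n):
    """Bound (5): ceil((mn / k) * ceil(m(n - 1) / (k - 1)))."""
    return ceil_div(m * n * ceil_div(m * (n - 1), k - 1), k)


def theorem8_bound(k, m, n):
    """Return beta + 1 if the hypotheses of Theorem 8 hold, otherwise beta."""
    bound = beta(k, m, n)
    if (m * (n - 1)) % (k - 1) == 0:
        t = n * m * m * (n - 1) // (k - 1)
        if t % k == k - 1:  # t is congruent to -1 modulo k
            return bound + 1
    return bound


# Examples quoted after Theorem 8, keyed by (k, m, n) with mn points.
examples = {
    (7, 2, 13): 16, (7, 2, 16): 24, (5, 4, 8): 46,
    (5, 4, 13): 126, (11, 4, 16): 36, (5, 4, 18): 246,
}
for params, expected in examples.items():
    assert theorem8_bound(*params) == expected
```

Each listed value is $\beta(k,m;mn) + 1$; for instance $\beta(7,2;26) = 15$.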

Lemma 2. Let $(X,\mathcal{G},\mathcal{B})$ be a group covering design $GC[k,m;mn]$ and let
$m(n-1) \equiv 0 \pmod{k-1}$ and $nm^2(n-1)/(k-1) \equiv -2 \pmod{k}$. If $C(k,m;mn) = \beta(k,m;mn)$
then there exists a unique pairset $\{x_0,y_0\} \subseteq X$ which is contained
in exactly $k$ blocks and every other pairset is contained in exactly one block.

Proof. Note that $C(k,m;mn) = \frac{1}{k}\sum_{x \in X} f(x) = \frac{1}{k}\left(\frac{nm^2(n-1)}{k-1} + 2\right)$, so that $\sum_{x \in X} g(x) = 2$.
Thus, there exists $x_0 \in X$ with $g(x_0) > 0$. Hence there exists $y_0 \in X$ such
that $\{x_0,y_0\}$ is contained in at least two blocks. Thus $g(y_0) \ge 1$. Hence $g(x_0) =
g(y_0) = 1$ and $g(x) = 0$ for all other $x \in X$. Hence by Lemma 1, every pairset
other than $\{x_0,y_0\}$ will be contained in exactly one block. Since each block contains
$\binom{k}{2}$ pairsets and $\beta(k,m;mn)\binom{k}{2} = \binom{n}{2}m^2 + (k-1)$, the pairset $\{x_0,y_0\}$
will be contained in exactly $k$ blocks and the result follows. ■



Theorem 9. Let $(X,\mathcal{G},\mathcal{B})$ be a group covering design $GC[k,m;mn]$ and let
$m(n-1) \equiv 0 \pmod{k-1}$ and $nm^2(n-1)/(k-1) \equiv -2 \pmod{k}$. If
$m(n-1)/(k-1) < k + m - 2$ then $C(k,m;mn) \ge \beta(k,m;mn) + 1$.

Proof. Suppose $C(k,m;mn) = \beta(k,m;mn)$. By Lemma 2, there exists a
pairset $\{x_0,y_0\}$ contained in exactly $k$ blocks of $\mathcal{B}$ and $f(x_0) = f(y_0) = m_0 + 1$.
Let $y_0 \in G$. Then for every $y \in G$, $y \ne y_0$, the pairset $\{x_0,y\}$ must be contained
in exactly one block of $\mathcal{B}$. Thus $f(x_0) \ge k + (m-1)$, a contradiction. ■

As a consequence of the above theorem, there are many improvements to (5),
e.g., $C(6,2;22) \ge 16$, $C(8,3;45) \ge 35$, $C(5,4;28) \ge 35$, $C(10,4;76) \ge 62$.

Lemma 3. Let $n \ge k \ge 3$ and let $(X,\mathcal{G},\mathcal{B})$ be a group covering design
$GC[k,m;mn]$. If $m(n-1) \equiv 0 \pmod{k-1}$, $nm^2(n-1)/(k-1) \equiv -3 \pmod{k}$
and $C(k,m;mn) = \beta(k,m;mn)$ then there exist three elements $x_0$, $y_0$ and $z_0$ of
$X$ belonging to distinct groups such that each of the pairsets $\{x_0,y_0\}$, $\{y_0,z_0\}$ and
$\{z_0,x_0\}$ is contained in $\frac{1}{2}(k+1)$ blocks of $\mathcal{B}$ and all other pairsets are contained
in exactly one block.

Proof. Observe that $C(k,m;mn) = \frac{1}{k}\sum_{x \in X} f(x) = \frac{1}{k}\left(\frac{nm^2(n-1)}{k-1} + 3\right)$, so that $\sum_{x \in X} g(x) = 3$.
Thus, there exists a point $x_0 \in X$ with $g(x_0) > 0$. Following the arguments
given in Lemma 2, there exists $y_0 \in X$ with $g(y_0) > 0$ and $x_0$ and $y_0$ are in
two distinct groups. Note that $\beta(k,m;mn)\binom{k}{2} = \binom{n}{2}m^2 + \frac{3}{2}(k-1)$.
If $g(x_0) = 2$ and $g(y_0) = 1$ then $g(x) = 0$ for all other $x \in X$ and the pairset
$\{x_0,y_0\}$ will be contained in $\frac{3}{2}(k-1) + 1$ blocks of $\mathcal{B}$. Hence $g(x_0) = g(y_0)$, a
contradiction. Therefore, $g(x_0) = g(y_0) = 1$ and there exists $z_0 \in X$ such that
$g(z_0) = 1$. Hence for each $x \in \{x_0,y_0,z_0\}$, the number of pairsets containing $x$
(counting multiplicity) that are contained in more than one block is $k-1$. By
Lemma 1 these $(k-1)$ extra pairsets containing $x_0$ must contain either $y_0$ or
$z_0$. Similar statements hold for $y_0$ and $z_0$. If $x_0$ and $z_0$ belong to the same group
then since the pairset $\{x_0,y_0\}$ occurs in exactly $k$ blocks, there will be no extra
pairset containing $z_0$, i.e., $g(z_0) = 0$, a contradiction. Hence $x_0$, $y_0$ and $z_0$ must
belong to distinct groups, each of the pairsets $\{x_0,y_0\}$, $\{y_0,z_0\}$ and $\{z_0,x_0\}$ must
occur in $(k+1)/2$ blocks and all other pairsets occur exactly once. ■

The proof of the following theorem, being similar to the proof of Theorem 9,
is omitted.

Theorem 10. Let $(X,\mathcal{G},\mathcal{B})$ be a group covering design $GC[k,m;mn]$ and let
$m(n-1) \equiv 0 \pmod{k-1}$ and $nm^2(n-1)/(k-1) \equiv -3 \pmod{k}$. If
$m(n-1)/(k-1) < (k-3)/2 + m$, then $C(k,m;mn) \ge \beta(k,m;mn) + 1$.

The above theorem gives improvements to (5) for many values of $k$, $m$ and
$n$; e.g., $C(7,2;20) \ge 10$, $C(9,2;26) \ge 10$, $C(11,2;32) \ge 10$, $C(9,4;52) \ge 36$.
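The hypotheses of Theorems 8, 9 and 10 are simple arithmetic conditions, so the quoted improvements can be reproduced by direct computation. The sketch below is illustrative only: `improved_bound` is a hypothetical helper name, it assumes $k \ge 3$, and it transcribes the hypotheses exactly as stated above without checking any further conditions the theorems may implicitly need.

```python
def ceil_div(a, b):
    """Ceiling of a / b for positive integers."""
    return -(-a // b)


def beta(k, m, n):
    """Bound (5) for a GC[k, m; mn]."""
    return ceil_div(m * n * ceil_div(m * (n - 1), k - 1), k)


def improved_bound(k, m, n):
    """beta + 1 when Theorem 8, 9 or 10 applies as stated, else beta."""
    bound = beta(k, m, n)
    if (m * (n - 1)) % (k - 1) != 0:
        return bound
    m0 = m * (n - 1) // (k - 1)
    # n * m * m0 equals n m^2 (n - 1) / (k - 1); r is its negative modulo k.
    r = (-(n * m * m0)) % k
    if r == 1:                                   # Theorem 8
        return bound + 1
    if r == 2 and m0 < k + m - 2:                # Theorem 9
        return bound + 1
    if r == 3 and 2 * m0 < (k - 3) + 2 * m:      # Theorem 10
        return bound + 1
    return bound


theorem9_examples = {(6, 2, 11): 16, (8, 3, 15): 35, (5, 4, 7): 35, (10, 4, 19): 62}
theorem10_examples = {(7, 2, 10): 10, (9, 2, 13): 10, (11, 2, 16): 10, (9, 4, 13): 36}
for params, expected in {**theorem9_examples, **theorem10_examples}.items():
    assert improved_bound(*params) == expected
```

For $(k,m,n) = (5,2,7)$ the congruence of Theorem 10 holds but the inequality fails ($3 < 3$ is false), so the sketch returns $\beta(5,2;14) = 9$; this is the case treated separately in Theorem 11.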

Remark 3. If $m = 1$, Lemmas 2 and 3 give analogous results for covering designs.



Theorem 11. $C(5,2;14) \ge 10$.

Proof. Suppose $(X,\mathcal{G},\mathcal{B})$ is a group covering design $GC[5,2;14]$ with 9 blocks.
Let $\mathcal{G} = \{G_1,\dots,G_7\}$. For simplicity, let $G_j = \{1,2\}$, meaning thereby that 1 (2)
represents the first (second) element of the group $G_j$. Moreover for $B \in \mathcal{B}$ (pairset
$P$) it is convenient to write $B = x_1 \dots x_7$ ($P = x_1 \dots x_7$) with $x_i \in G_i \cup \{0\}$. Here
$x_j = 0$ will mean $B$ ($P$) does not contain any element of $G_j$. Since $\beta(5,2;14) = 9$,
by Lemma 3 there exist 3 pairsets, say $P_1 = 110\dots0$, $P_2 = 0110\dots0$ and
$P_3 = 1010\dots0$, that are contained in exactly 3 blocks of $\mathcal{B}$ and every other
pairset is contained in exactly 1 block. Following the argument given in Lemma 3,
it is easy to see that for each $i = 1,2,3$ there exist exactly 4 blocks having $x_i = 1$.
Since each of the pairsets $120\dots0$, $210\dots0$ and $220\dots0$ occurs in exactly 1
block, there exist 6 blocks $B_1,\dots,B_6$ whose first 2 entries are 11, 11, 11, 12, 21
and 22 respectively. Since each of the pairsets $P_2$ and $P_3$ occurs in exactly 3
blocks and $x_3 = 1$ for only 4 blocks, the following two possibilities arise.

Case 1: $B_1$ = 111****, $B_2$ = 111****, $B_3$ = 111****,
        $B_4$ = 122****, $B_5$ = 212****, $B_6$ = 221****

Case 2: $B_1$ = 111****, $B_2$ = 111****, $B_3$ = 112****,
        $B_4$ = 121****, $B_5$ = 211****, $B_6$ = 22$\gamma$****

where * denotes an element of $\{0,1,2\}$ and $\gamma \ne 1$. Suppose that the $B_i$'s have the
configuration given in Case 1. Then each pairset with $x_1 = 1$, $x_2 = x_3 = 0$
must be contained in exactly one $B_i$, $1 \le i \le 4$. Without loss of generality, let
$B_5 = 212\alpha\beta00$ and let $P = 100\alpha000$. If $P$ is contained in $B_1$ or $B_2$ or $B_3$, say $B^*$,
then the pairset $Q = 010\alpha000$ will be contained in $B^*$ and $B_5$, a contradiction.
Otherwise $P$ is contained in $B_4$ and hence the pairset $R = 002\alpha000$ is contained
in $B_4$ and $B_5$, a contradiction. Thus $C(5,2;14) > 9$. The proof in Case 2 follows
similarly. ■

3 Lower Bounds for $C(k,m;mn)$ from Known
Combinatorial Designs

A relation between covering numbers and group covering numbers is given by 
the following theorem. 

Theorem 12. If $n \ge m+1$ then $C(m+1,1;mn+1) \le n + C(m+1,m;mn)$.

Proof. Let $(Y,\mathcal{G},\mathcal{D})$ be a minimum group covering design $GC[m+1,m;mn]$
and let $\mathcal{G} = \{G_1,G_2,\dots,G_n\}$. Let $y_0 \notin Y$. Consider $X = Y \cup \{y_0\}$ and $\mathcal{B} =
\mathcal{D} \cup \mathcal{D}'$, where $\mathcal{D}' = \{G_1 \cup \{y_0\}, G_2 \cup \{y_0\}, \dots, G_n \cup \{y_0\}\}$. It is easy to verify
that $(X,\mathcal{B})$ is a covering design $AD[m+1,1;mn+1]$. Thus $C(m+1,1;mn+1) \le
n + C(m+1,m;mn)$. ■
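The construction in this proof can be tried on a small case. The sketch below is an illustration rather than part of the paper: it takes the eight blocks of the $GC[3,2;8]$ listed under Theorem 19 (i) (so $m = 2$, $n = 4$), in the word notation of Theorem 11, adjoins a new point (named `y0` here) to every group, and checks that every pair of the nine points is covered. The variable names are hypothetical.

```python
from itertools import combinations

# Blocks of the GC[3,2;8] of Theorem 19 (i): character i is the element
# (1 or 2) taken from group i, and 0 means no element of that group.
words = ["1110", "1201", "2102", "2220", "1022", "2011", "0121", "0212"]

groups = [[(i, 1), (i, 2)] for i in range(4)]
design_blocks = [
    frozenset((i, int(c)) for i, c in enumerate(w) if c != "0") for w in words
]

# Theorem 12: add a new point y0 and the n extra blocks G_i ∪ {y0}.
y0 = "y0"
extra_blocks = [frozenset(g) | {y0} for g in groups]
blocks = design_blocks + extra_blocks
points = [p for g in groups for p in g] + [y0]

covered = {frozenset(pair) for b in blocks for pair in combinations(b, 2)}
all_pairs = {frozenset(pair) for pair in combinations(points, 2)}
is_covering = all_pairs <= covered
assert is_covering and len(blocks) == 12
```

Here the 12 triples contain $12 \cdot 3 = 36 = \binom{9}{2}$ pairs in total, so each pair of the nine points is in fact covered exactly once.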

As an immediate consequence of the above theorem, we have the following.

Corollary 1. If $n \ge m+1$ then $C(m+1,m;mn) \ge \beta(m+1,m;mn) + \Delta(mn+1;m+1)$,
where $\Delta(mn+1;m+1) = C(m+1,1;mn+1) - \alpha(m+1,1;mn+1)$.

Proof. Observe that $\alpha(m+1,1;mn+1) - n = \beta(m+1,m;mn)$. ■
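The identity used in this proof can be written out in one line. The derivation below assumes that $\alpha(k,1;v)$ denotes the Schönheim bound $\lceil \frac{v}{k} \lceil \frac{v-1}{k-1} \rceil \rceil$ (its definition lies outside this section) and uses (5) for $\beta$; subtracting the integer $n$ commutes with the ceiling.

```latex
\begin{align*}
\alpha(m+1,1;mn+1) - n
  &= \left\lceil \frac{mn+1}{m+1} \left\lceil \frac{mn}{m} \right\rceil \right\rceil - n
   = \left\lceil \frac{n(mn+1)}{m+1} - n \right\rceil \\
  &= \left\lceil \frac{mn(n-1)}{m+1} \right\rceil
   = \left\lceil \frac{mn}{m+1} \left\lceil \frac{m(n-1)}{m} \right\rceil \right\rceil
   = \beta(m+1,m;mn).
\end{align*}
```

Combined with Theorem 12, this gives $C(m+1,m;mn) \ge C(m+1,1;mn+1) - n = \beta(m+1,m;mn) + \Delta(mn+1;m+1)$.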

If $\Delta(mn+1;m+1) > 0$ then the above corollary gives an improvement to
(5), e.g., $C(4,3;18) \ge \beta(4,3;18) + 2 = 25$, and $C(5,4;28) \ge \beta(5,4;28) + 1 = 35$.

Remark 4. Let $n(mn+1) \equiv -1 \pmod{m+1}$. Then Hanani [14] has shown
that $C(m+1,1;mn+1) \ge \alpha(m+1,1;mn+1) + 1$ and hence by Corollary 1
$C(m+1,m;mn) \ge \beta(m+1,m;mn) + 1$.

Theorem 13. Let $t \ge n \ge 3$. If there exists a transversal design $TD[n;m]$ then
$C(n,m;m(t(n-1)+1)) \le tm^2 + C(n,m(n-1);m(n-1)t)$.

Proof. Let $\{(X_j,\mathcal{G}_j,\mathcal{B}_j) : j = 1,\dots,t\}$ be a collection of $t$ transversal designs
$TD[n;m]$ satisfying $X_i \cap X_j = G_0$ for all $i \ne j$ and $\mathcal{G}_j = \{G_0 = G_{j1}, G_{j2}, \dots,
G_{jn}\}$. Let $Y = \bigcup_{j=1}^{t} \bigcup_{i=2}^{n} G_{ji}$, and let $\mathcal{G}^* = \{G_1^*,G_2^*,\dots,G_t^*\}$ where $G_j^* = \bigcup_{2 \le i \le n} G_{ji}$.
Let $\mathcal{D}$ be a collection of $n$-subsets of $Y$ such that $(Y,\mathcal{G}^*,\mathcal{D})$ is a minimal group
covering design $GC[n,m(n-1);m(n-1)t]$. Then $(X,\mathcal{H},\mathcal{B})$ where $X = Y \cup
G_0$, $\mathcal{H} = \bigcup_j \mathcal{G}_j$ and $\mathcal{B} = \mathcal{D} \cup \mathcal{B}_1 \cup \dots \cup \mathcal{B}_t$ is a $GC[n,m;m((n-1)t+1)]$. ■

If $m$ is a prime power then a $TD[m+1;m]$ exists [14]. Hence we have

Corollary 2. Let $m$ be a prime power with $m = n-1 \ge 2$ and $t \ge n$. If
$C(n,m(n-1);m(n-1)t) = \beta(n,m(n-1);m(n-1)t)$ then $C(n,m;m(t(n-1)+1)) = \beta(n,m;m(t(n-1)+1))$.

Proof. Note that $\beta(n,m(n-1);m(n-1)t) = \lceil tm^2(n-1)(t-1)/n \rceil$. Hence by the above
theorem, $C(n,m;m(t(n-1)+1)) \le \beta(n,m;m(t(n-1)+1))$ and now the result follows
from (5). ■



4 Exact Bounds for $C(k,m;mn)$

Constructions of minimum group covering designs can be quite challenging. In
this section we use the structure of covering designs, meeting certain conditions,
to construct a few classes of minimum group covering designs. In [4], Caro and
Raphael have shown that for each $k$, the size of the block, there exists a minimum
covering design. This is used to establish the existence of a minimum group covering
design for any $k$.

Recall that $tc(k;m) = C(k,m;km)$. Using extremal set theory, Kleitman and
Spencer [16] have proved the following theorem.

Theorem 14. [16] Let $s$ be a positive integer and let $g(s) = \binom{s-1}{\lfloor s/2 \rfloor - 1}$. Then
$tc(k;2) = \min\{s : g(s) \ge k\}$.
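Theorem 14 gives an explicit recipe for $tc(k;2)$, which the following sketch evaluates. It is an illustration only: the name `tc2` is hypothetical, the search starts at $s = 2$ (the smallest $s$ for which the lower index of the binomial is non-negative), and $k \ge 2$ is assumed.

```python
from math import comb


def g(s):
    """g(s) = C(s - 1, floor(s / 2) - 1) from Theorem 14, for s >= 2."""
    return comb(s - 1, s // 2 - 1)


def tc2(k):
    """Smallest s with g(s) >= k, i.e. tc(k; 2) according to Theorem 14."""
    s = 2
    while g(s) < k:
        s += 1
    return s


values = [tc2(k) for k in (2, 3, 4, 5, 10, 11)]
assert values == [4, 4, 5, 6, 6, 7]
```

Since $g(4) = 3$, $g(5) = 4$, $g(6) = 10$ and $g(7) = 15$, the value of $tc(k;2)$ grows only logarithmically in $k$.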

If a covering design meeting the Schönheim bound exists then the following
theorem gives a construction of a minimum group covering design.

Theorem 15. Let $(v-1) \equiv 0 \pmod{k-1}$, $n = (v-1)/(k-1)$, $nv \equiv -2
\pmod{k}$, and let $v > k(k-2)+2$. If there exists a minimum covering design
$AD[k,1;v]$ with $C(k,1;v) = \alpha(k,1;v)$ then there also exists a minimum group
covering design $GC[k,k-1;v-1]$ with

\[ C(k,k-1;v-1) = \left\lceil \frac{(v-1)(v-k)}{k(k-1)} \right\rceil. \]

Proof. Let $(Y,\mathcal{D})$ be a covering design $AD[k,1;v]$ with
$C(k,1;v) = \alpha(k,1;v)$. Then by Remark 3 there exists a unique pairset $\{u,v\} \subset Y$
which is contained in exactly $k$ blocks, say $B_1,B_2,\dots,B_k$, and all other pairsets
are contained in exactly one block. Observe that $B_i \cap B_j = \{u,v\}$ for all $1 \le i <
j \le k$. Let $S = \bigcup_{i=1}^{k} B_i$. Clearly $|S| = k(k-2)+2$ and $Y \setminus S \ne \emptyset$. Let $x_0 \in Y \setminus S$.
Since any pairset containing $x_0$ must be contained in exactly one block of $\mathcal{D}$,
the number of blocks containing $x_0$ is $n$. Let $T_1,\dots,T_n$ be the blocks containing
$x_0$. Clearly $\mathcal{G} = \{T_i \setminus \{x_0\} : 1 \le i \le n\}$ is a partition of $X = Y \setminus \{x_0\}$ and
$\{u,v\} \not\subseteq T_i$ for $1 \le i \le n$. Let $\mathcal{B} = \mathcal{D} \setminus \{T_1,T_2,\dots,T_n\}$. Note that for each
$B \in \mathcal{B}$, $|T_i \cap B| \le 1$ for $1 \le i \le n$. Since $(Y,\mathcal{D})$ is a covering design, every
pairset $\{s,t\} \subset X$ such that $s$ and $t$ are from distinct $T_i \setminus \{x_0\}$ will be contained
in a block of $\mathcal{B}$. Hence $(X,\mathcal{G},\mathcal{B})$ is a group covering design $GC[k,k-1;v-1]$
with

\[ C(k,k-1;v-1) \le |\mathcal{B}| = \alpha(k,1;v) - n = \left\lceil \frac{(v-1)(v-k)}{k(k-1)} \right\rceil. \]

Therefore by (5), $C(k,k-1;v-1) \ge \left\lceil \frac{(v-1)(v-k)}{k(k-1)} \right\rceil$. Hence the theorem follows. ■

If $n \equiv 0$ or $1 \pmod{3}$ and $k = 3$ then Hanani [14] has shown that a
$GD[3,2;2n]$ exists and hence $C(3,2;2n) = \lceil 2n(n-1)/3 \rceil$. If $n \equiv 2 \pmod{3}$
and $v = 2n+1$ then Fort and Hedlund [13] have shown that there exists a
covering design $AD[3,1;v]$ with $C(3,1;v) = \alpha(3,1;v)$. Hence by Theorem 15 the
following corollary is immediate.

Corollary 3. If $n \ge 3$ then $C(3,2;2n) = \lceil 2n(n-1)/3 \rceil$.

If $n \equiv 0$ or $1 \pmod{4}$ and $k = 4$ then Brouwer et al. [3] have shown that a
$GD[4,3;3n]$ exists and hence $C(4,3;3n) = \lceil 3n(n-1)/4 \rceil$. If $n \equiv 2$ or $3 \pmod{4}$
and $v = 3n+1$ then Mills [19], [20] has shown that there exists a covering design
$AD[4,1;v]$ with $C(4,1;v) = \alpha(4,1;v)$ with the exception of $v = 19$. Hence by
Theorem 15 we have the following result.

Corollary 4. If $n \ge 4$ and $n \ne 6$ then $C(4,3;3n) = \lceil 3n(n-1)/4 \rceil$.

Remark 5. If $n = 6$, Corollary 1 gives $C(4,3;18) \ge 25 > \lceil 3n(n-1)/4 \rceil$.

If $k = 5$ and $v = 4n+1$, then Mills and Mullin [21] have shown that
$C(5,1;v) = \alpha(5,1;v)$ whenever $n \equiv 2 \pmod{5}$, $n \ge 787$ or $n \equiv 4 \pmod{5}$,
$n \ge 189$. In each of these cases, the hypotheses of Theorem 15 are satisfied.
Hence we have the following corollary.

Corollary 5. If $n \equiv 2 \pmod{5}$, $n \ge 787$ or $n \equiv 4 \pmod{5}$, $n \ge 189$ then
$C(5,4;4n) = \lceil 4n(n-1)/5 \rceil$.

Let $v_0(k)$ be as in Theorem 2 and let $v_2(k) = \max\{v_0(k), 3(k-2)(k+1)/2 + 4\}$.
If $v \ge v_2(k)$, the following theorem follows from Theorems 2 and 15.

Theorem 16. Let $v \ge v_2(k)$, $(v-1) \equiv 0 \pmod{k-1}$ and let
$v(v-1)/(k-1) \equiv -2 \pmod{k}$. Then there exists a $GC[k,k-1;v-1]$ with

\[ C(k,k-1;v-1) = \left\lceil \frac{(v-1)(v-k)}{k(k-1)} \right\rceil. \]

Theorem 17. Let $k \ge 3$, $(v-1) \equiv 0 \pmod{k-1}$, $n = (v-1)/(k-1)$, $nv \equiv -3 \pmod{k}$,
and let $v > 3(k+1)(k-2)/2 + 3$. If there exists a minimum covering design
$AD[k,1;v]$ with $C(k,1;v) = \alpha(k,1;v)$ then there also exists a $GC[k,k-1;v-1]$
with

\[ C(k,k-1;v-1) = \left\lceil \frac{(v-1)(v-k)}{k(k-1)} \right\rceil. \]

Proof. Let $(Y,\mathcal{D})$ be a covering design $AD[k,1;v]$ with $C(k,1;v) = \alpha(k,1;v)$.
Then by Remark 3 there exist three pairsets, say $\{x,y\}$, $\{y,z\}$ and $\{z,x\}$, each
of which is contained in exactly $(k+1)/2$ blocks of $\mathcal{D}$ (not necessarily distinct)
and all other pairsets are contained in exactly one block. Let $B_1,\dots,B_s$
be the blocks containing the pairsets $\{x,y\}$, $\{y,z\}$ and $\{z,x\}$ and let $S = \bigcup_{i=1}^{s} B_i$.
Note that $(k+1)(k-3)/2 + 3 \le |S| \le 3(k+1)(k-2)/2 + 3$. Since
$v > 3(k+1)(k-2)/2 + 3$, $Y \setminus S \ne \emptyset$. Let $x_0 \in Y \setminus S$ and let $T_1,T_2,\dots,T_n$ be
the blocks containing $x_0$. Let $X = Y \setminus \{x_0\}$, $\mathcal{G} = \{T_i \setminus \{x_0\} : 1 \le i \le n\}$ and
let $\mathcal{B} = \mathcal{D} \setminus \{T_1,T_2,\dots,T_n\}$. Following the argument given in the proof of
Theorem 15 it is easy to verify that $(X,\mathcal{G},\mathcal{B})$ is a minimum group covering design
containing $\left\lceil \frac{(v-1)(v-k)}{k(k-1)} \right\rceil$ blocks. ■

Let $v_0(k)$ be as in Theorem 2 and let $v_3(k) = \max\{v_0(k), 3(k-2)(k+1)/2 + 4\}$.
If $v \ge v_3(k)$, by Theorems 2 and 17 we have the following theorem.

Theorem 18. Let $k \ge 3$, $v \ge v_3(k)$, $(v-1) \equiv 0 \pmod{k-1}$ and let
$v(v-1)/(k-1) \equiv -3 \pmod{k}$. Then there exists a $GC[k,k-1;v-1]$ with

\[ C(k,k-1;v-1) = \left\lceil \frac{(v-1)(v-k)}{k(k-1)} \right\rceil. \]



Theorem 19. (i) $C(3,2;8) = 8$, (ii) $C(4,2;10) = 8$,
(iii) $C(5,2;12) = 8$, (iv) $C(6,2;14) = 7$.

Proof. We use the notation for representing a block as described in Theorem 11.

(i) By (5), $C(3,2;8) \ge 8$. The following eight blocks give the desired result.

1110 1201 2102 2220
1022 2011 0121 0212

(ii) By (5), $C(4,2;10) \ge 8$. The following eight blocks give the desired result.

21110 11201 12102 22220
11022 12011 20121 20212

(iii) By (5), $C(5,2;12) \ge 8$. The following eight blocks give the desired result.

121110 111201 112102 122220
211022 212011 220121 220212

(iv) By (5), $C(6,2;14) \ge 7$. The following seven blocks give the desired result.

1201221 2112201 2220212 1122022
2011122 0121111 1212110 ■
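The four block lists in this proof can be verified by exhaustive checking. The sketch below is an illustration under the word notation of Theorem 11 (one character per group, 0 meaning that the block misses the group); the helper name `is_group_covering` is hypothetical, and the block words are the listed ones as read from the text, with the run of digits in (iv) split into words of length 7.

```python
from itertools import combinations


def is_group_covering(words, k, m=2):
    """Check that the words are blocks of size k covering every pairset."""
    n = len(words[0])
    if any(len(w) != n or sum(c != "0" for c in w) != k for w in words):
        return False
    symbols = [str(s) for s in range(1, m + 1)]
    for i, j in combinations(range(n), 2):
        seen = {(w[i], w[j]) for w in words}
        # Every choice of one element from group i and one from group j
        # must appear together in some block.
        if any((a, b) not in seen for a in symbols for b in symbols):
            return False
    return True


# Keys are (k, n): block size and number of groups, each of size m = 2.
designs = {
    (3, 4): "1110 1201 2102 2220 1022 2011 0121 0212",
    (4, 5): "21110 11201 12102 22220 11022 12011 20121 20212",
    (5, 6): "121110 111201 112102 122220 211022 212011 220121 220212",
    (6, 7): "1201221 2112201 2220212 1122022 2011122 0121111 1212110",
}
results = {key: is_group_covering(text.split(), key[0])
           for key, text in designs.items()}
assert all(results.values())
```

Together with the lower bound (5), a successful check of each list confirms the corresponding equality in Theorem 19.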



5 Group Divisible Designs with Consecutive Block Sizes 

Recently Colbourn, Lenz, Ling, Rosa, Stinson and others have studied PBDs
with consecutive block sizes (see [17], [11], [12] and [18]). However, very little is
known about GDDs with consecutive block sizes. In this section we use Lemmas
2 and 3 to construct some GDDs with two, three or four consecutive block sizes.

Theorem 20. Let $(X,\mathcal{G},\mathcal{B})$ be a minimum group covering design
$GC[k,m;mn]$ satisfying $m(n-1) \equiv 0 \pmod{k-1}$ and $nm^2(n-1)/(k-1) \equiv
-2 \pmod{k}$ and $|\mathcal{B}| = \beta(k,m;mn)$. Then there exist group divisible designs
$GD[\{k-1,k\},\{m-1,m\};mn-1]$ and $GD[\{k-2,k-1,k\},\{m-1,m\};mn-2]$.

Proof. By Lemma 2 there exists a unique pairset $P_0 = \{x_0,y_0\}$ which is
contained in exactly $k$ blocks. Let $X_1 = X \setminus \{x_0\}$, $X_2 = X \setminus P_0$, $\mathcal{G}_1 = \{G \setminus \{x_0\} :
G \in \mathcal{G}\}$, $\mathcal{G}_2 = \{G \setminus P_0 : G \in \mathcal{G}\}$, $\mathcal{B}_1 = \{B \setminus \{x_0\} : B \in \mathcal{B}\}$ and let $\mathcal{B}_2 = \{B \setminus P_0 :
B \in \mathcal{B}\}$. Then it is easy to verify that $(X_1,\mathcal{G}_1,\mathcal{B}_1)$ is a $GD[\{k-1,k\},\{m-1,m\};mn-1]$
and $(X_2,\mathcal{G}_2,\mathcal{B}_2)$ is a $GD[\{k-2,k-1,k\},\{m-1,m\};mn-2]$. ■

Theorem 21. Let $n \ge k \ge 3$ and let $(X,\mathcal{G},\mathcal{B})$ be a minimum group covering
design $GC[k,m;mn]$ with $|\mathcal{B}| = \beta(k,m;mn)$. If $m(n-1) \equiv 0 \pmod{k-1}$
and $nm^2(n-1)/(k-1) \equiv -3 \pmod{k}$ then there exist group divisible designs
$GD[\{k-2,k-1,k\},\{m-1,m\};mn-2]$ and $GD[\{k-3,k-2,k-1,k\},\{m-1,m\};mn-3]$.

Proof. By Lemma 3 there exist three pairsets, say $\{x,y\}$, $\{y,z\}$ and $\{z,x\}$,
each of which is contained in exactly $(k+1)/2$ blocks. Let $X_1 = X \setminus \{x,y\}$, $X_2 =
X \setminus \{x,y,z\}$, $\mathcal{G}_1 = \{G \setminus \{x,y\} : G \in \mathcal{G}\}$, $\mathcal{G}_2 = \{G \setminus \{x,y,z\} : G \in \mathcal{G}\}$, $\mathcal{B}_1 =
\{B \setminus \{x,y\} : B \in \mathcal{B}\}$ and let $\mathcal{B}_2 = \{B \setminus \{x,y,z\} : B \in \mathcal{B}\}$. Observe that $\mathcal{B}_1$
contains blocks of sizes $k-2$, $k-1$ or $k$ while $\mathcal{B}_2$ contains blocks of sizes
$k-3$, $k-2$, $k-1$ or $k$. A block of size $k-3$ in $\mathcal{B}_2$ will exist only when there exists
a block containing $x$, $y$ and $z$. It can be easily verified that $(X_1,\mathcal{G}_1,\mathcal{B}_1)$ and
$(X_2,\mathcal{G}_2,\mathcal{B}_2)$ are $GD[\{k-2,k-1,k\},\{m-1,m\};mn-2]$ and
$GD[\{k-3,k-2,k-1,k\},\{m-1,m\};mn-3]$, respectively. ■

It may be observed that if $mn \ge v_2(k)$ ($v_3(k)$), $m(n-1) \equiv 0 \pmod{k-1}$
and $nm^2(n-1)/(k-1) \equiv -2 \pmod{k}$ ($\equiv -3 \pmod{k}$) then Theorem 16
(Theorem 18) guarantees the existence of a minimum group covering design
with $\beta(k,m;mn)$ blocks.

Abstract. Given a cocycle, $\alpha$, the concept of a sequence being $\alpha$-correlated
provides a link between the cohomology of finite groups and various
combinatorial objects: auto-correlated sequences, relative difference sets
and generalised Hadamard matrices. The cohomology enables us to lift
a map $\phi$, defined on a group, to a map $\Phi$, defined on an extension group,
in such a way that $\Phi$ inherits some of its combinatorial properties from
those of $\phi$. For example, if $\phi$ is $\alpha$-correlated, $\Phi$ will be the characteristic
function of a relative difference set in the extension group determined by
$\alpha$. Many well-known results follow from choosing the appropriate
extension groups and cocycles, $\alpha$.



1 Introduction 

It is well known that many objects of combinatorial interest are intimately con- 
nected; relative difference sets, sequences with certain auto-correlation properties 
and generalised Hadamard matrices for example. Recently several authors have 
related these objects to the cohomological theory of finite groups (see [6] and 
the references therein). That there is a relationship is, in hindsight, no surprise 
because combinatorial objects such as those mentioned are often defined and 
studied in terms of a group and its extensions. Cohomology is a natural lan- 
guage to discuss group extensions. In this spirit, the work presented here will 
use ideas from cohomology to show the equivalence of the combinatorial objects 
mentioned, under certain circumstances. The equivalence is constructive show- 
ing how to pass easily from one object to another (see Theorem 3). One of the 
advantages of this approach is that the theory covers, in a single framework, 
both splitting and non-splitting extensions (see the corollaries to Theorem 3). 
Another is that construction techniques for any of the objects can be proved 
using the most convenient equivalence (see Section 4) . 

In Section 2 we introduce definitions that express certain distribution
properties of the values of a map between groups. These definitions are fundamentally
connected to each other and to objects from cohomology, namely “cocycles”. In
Section 3 we review the results and definitions we need concerning cocycles and
group extensions. We then show how, when given a map between groups, $\phi$, with
a certain distribution property, we can lift this map to an extension group in such
a way that the lifted map has a similar distribution property. The lifted map
will be the characteristic function for a relative difference set in the extension
group. We apply this to the split extension and the natural extension. In Section
4 we give some methods that enable us, when given such $\phi$, to construct others.
Finally we show that a Generalised Perfect Binary Array (see [2]) corresponds
to a special example of the lifting of such a map $\phi$.

Throughout this paper we adopt the following notation: $A$ will be a finite
abelian group; $\mathbb{Z}A$ the group ring of $A$ over the integers; if $w$ is a vector, $w_i$ will
denote its $i$-th component.

2 Relative Orthogonality and Relative Correlation 

We give two definitions that describe the distribution properties of maps between 
groups. The definitions are then connected to cohomology. 

Definition 1. Let $\psi : L \times L \to A$ be a map of finite groups with $|A|$ dividing
$|L|$. Let $H$ be a subgroup of $L$. Then we say $\psi$ is orthogonal relative to $H$ if, in
$\mathbb{Z}A$, $b \in L - H$ implies

\[ \sum_{j \in L} \psi(b,j) = \frac{|L|}{|A|} \sum_{a \in A} a. \]

Here $L - H$ refers to the set of elements of $L$ that are not in $H$. Furthermore,
$\psi$ is called orthogonal if it is orthogonal relative to the trivial subgroup. □

$H$ is called a forbidden subgroup and we may alternatively express relative
orthogonality by saying that, when $b$ is not in the forbidden subgroup, the sequence
$\{\psi(b,j) : j \in L\}$ has each element of $A$ an equal number of times.

Important in what follows is the effect of onto group homomorphisms on 
relative orthogonality. We have the following result, the first part of which we 
will use later to lift orthogonality on a group to relative orthogonality on an 
extension of that group. 

Lemma 1. Let $\beta : R \to S$ and $\theta : A \to A'$ be onto homomorphisms of groups.
Let $\psi : S \times S \to A$ be a map of groups and $H$ a subgroup of $S$. Then

i) $\psi \circ (\beta \times \beta) : R \times R \to A$ is orthogonal relative to $\ker\beta$ if and only if $\psi$ is
orthogonal.

ii) If $\psi$ is orthogonal relative to $H$ then so is $\theta \circ \psi$.

Proof. We give an outline of the proof of i). Let $T$ be a transversal for the cosets
of $\ker\beta$ in $R$. For $b \in R$ we have, upon collecting by cosets,

\[ \sum_{r \in R} \psi(\beta(b),\beta(r)) = |\ker\beta| \sum_{t \in T} \psi(\beta(b),\beta(t)). \]

Now, because $\beta$ is onto, as $t$ runs over $T$ then $\beta(t)$ runs over $S$, so this last sum
is equal to $|\ker\beta| \sum_{s \in S} \psi(\beta(b),s)$. The result now follows. □

The concept of relative orthogonality is closely connected to that of the
“twisted relative correlation” of sequences. The nature of this connection, which
we prove in the next lemma, enables results concerning cohomology theory and
relative difference sets to be interpreted in terms of correlation of sequences. We
first need some definitions.

Definition 2. Let $\phi : L \to A$ and $\alpha : L \times L \to A$ be maps of groups with $|A|$
dividing $|L|$. Let $H$ be a subgroup of $L$. Then we say $\phi$ is $\alpha$-correlated relative to
$H$ if, in the group ring $\mathbb{Z}A$, $b \in L - H$ implies

\[ \sum_{j \in L} \alpha(b,j)\phi(j)(\phi(bj))^{-1} = \frac{|L|}{|A|} \sum_{a \in A} a. \]

Further, if $H$ is the trivial subgroup we omit the phrase “relative to $H$” and if $\alpha$
is identically 1, we replace “1-correlated” by “correlated”. □

So, in the definition above, we think of $\phi$ as being auto-correlated when the
auto-correlation function has been “twisted” by $\alpha$.

We will use the following notation from cohomology theory (see [4]). If
$\phi : L \to A$ is a map of groups, the coboundary $\partial\phi : L \times L \to A$ is defined
by $\partial\phi(m,n) = \phi(m)\phi(n)(\phi(mn))^{-1}$, $\forall m,n \in L$. We can now establish the
connection mentioned above.

Lemma 2. With the notation of the above definition, $\phi$ is $\alpha$-correlated relative
to $H$ if and only if $\alpha\partial\phi$ is orthogonal relative to $H$.

Proof. For $b \in L$, we have

\[ \sum_{j \in L} \alpha(b,j)\partial\phi(b,j) = \phi(b) \sum_{j \in L} \alpha(b,j)\phi(j)(\phi(bj))^{-1}. \]

Noting that $\phi(b) \sum_{a \in A} a = (\phi(b))^{-1} \sum_{a \in A} a = \sum_{a \in A} a$, we obtain the result.

□

3 Cohomological Aspects of Relative Orthogonality 

For the remainder of this paper let $G$ be a group of order $v$ and we suppose
$w \mid v$, where $w$ denotes the order of $A$. When the “twisting” function $\alpha$ in the
definition of $\alpha$-correlation is of a special form there is an equivalence between
the existence of $\alpha$-correlated maps, generalised Hadamard matrices and relative
difference sets in extension groups determined by $\alpha$. The special form needed is
that $\alpha$ is a cocycle (strictly a two dimensional cocycle with trivial action). We
will define this shortly, but the following result of Perera and Horadam indicates
the sort of relationship between cohomological and combinatorial objects that
we are interested in. A $v \times v$ matrix, $C$, of elements of $A$, indexed by the elements
of $G$ (in some fixed order), is a $(v/w)$-generalised Hadamard matrix if, whenever
$1 \le i \ne k \le v$, the list $c_{ij}c_{kj}^{-1}$, $1 \le j \le v$ has each element of $A$ exactly $v/w$
times (see [6] or [1]).

Theorem 1. [6, Lemma 2.2] Let $\psi : G \times G \to A$ be a cocycle and form the
matrix $M_\psi = [\psi(x,y)]$ indexed by the elements $x,y \in G$ (in some fixed order).
Then $\psi$ is orthogonal if and only if $M_\psi$ is a $(v/w)$-generalised Hadamard matrix.

□



3.1 Cocycles and Central Extensions 

We summarise the definitions and results we need on cocycles and central
extensions (for proofs see [4, Chapter 2]). We call the map $\alpha : G \times G \to A$ a
cocycle if $\alpha(1,1) = 1$ and $\forall x,y,z \in G$ it satisfies the equation $\alpha(x,y)\alpha(xy,z) =
\alpha(y,z)\alpha(x,yz)$. A consequence of this equation is that $\forall x \in G$ we have $\alpha(x,1) =
\alpha(1,x) = 1$. The abelian group of all such cocycles under the multiplication
$(\alpha\alpha')(x,y) = \alpha(x,y)\alpha'(x,y)$ is denoted $Z^2(G,A)$. If $\phi : G \to A$ is such that
$\phi(1) = 1$, the coboundary defined by $\partial\phi(x,y) = \phi(x)\phi(y)(\phi(xy))^{-1}$, $\forall x,y \in G$, is
in $Z^2(G,A)$. If $\alpha, \alpha' \in Z^2(G,A)$ and $\alpha = \alpha'\partial\phi$ for some $\phi$ we say $\alpha$ and $\alpha'$ are
cohomologous and write $\alpha \sim \alpha'$. This is an equivalence relation and the group
of equivalence (cohomology) classes, $[\alpha]$, is denoted $H^2(G,A)$.

Cocycles determine and are determined by central extensions. Consider a
central extension, $R$, of $G$ by $A$; that is consider a short exact sequence,

\[ 1 \to A \xrightarrow{\iota} R \xrightarrow{\beta} G \to 1, \tag{1} \]

where $\iota(A) = \ker\beta$ is a subgroup of the centre of $R$. We take a section, $\lambda$, of
$\beta$, that is a map $\lambda : G \to R$ such that $\beta(\lambda(x)) = x$, $\forall x \in G$ and $\lambda(1) = 1$.
Then $\lambda$ defines a cocycle $f_\lambda \in Z^2(G,A)$ by $\iota(f_\lambda(x,y)) = \lambda(x)\lambda(y)(\lambda(xy))^{-1}$.
We note that $\lambda(G)$ is a transversal for the cosets of $\iota(A)$ in $R$. Equivalently we
could define $\lambda$ given such a transversal. Different choices of $\lambda$ lead to cohomologous
cocycles. Conversely, given a cocycle $\alpha \in Z^2(G,A)$ we have the central extension

\[ 1 \to A \xrightarrow{\iota'} E_\alpha \xrightarrow{\beta'} G \to 1, \tag{2} \]

where $E_\alpha$ is the group $\{(a,x) : a \in A, x \in G\}$ with a “twisted” multiplication
defined by $(a,x)(b,y) = (ab\,\alpha(x,y),xy)$ and $\iota'(a) = (a,1)$, $\beta'(a,x) = x$. We refer
to this as the standard extension for $\alpha$. We also refer to the section of $\beta'$ given
by $\lambda'(x) = (1,x)$ as the standard section for $\alpha$ since it determines the cocycle
$f_{\lambda'} = \alpha$.
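A small worked case may make the standard extension concrete. The sketch below is an illustration that is not taken from the paper: it writes $A = G = \mathbb{Z}_2$ additively (so the identity is 0 and the cocycle equation becomes additive), takes the "carry" map as the cocycle, and builds the twisted multiplication. All names in the code are hypothetical.

```python
from itertools import product


def alpha(x, y):
    """Carry cocycle on Z_2 x Z_2 with values in Z_2: 1 only if x = y = 1."""
    return (x + y) // 2


def is_cocycle(f):
    """Additive form of f(1,1) = 1 and f(x,y) f(xy,z) = f(y,z) f(x,yz)."""
    if f(0, 0) != 0:
        return False
    return all(
        (f(x, y) + f((x + y) % 2, z)) % 2 == (f(y, z) + f(x, (y + z) % 2)) % 2
        for x, y, z in product(range(2), repeat=3)
    )


def mul(p, q):
    """Twisted multiplication (a, x)(b, y) = (a + b + alpha(x, y), x + y)."""
    (a, x), (b, y) = p, q
    return ((a + b + alpha(x, y)) % 2, (x + y) % 2)


def order(p):
    """Order of p in the extension group, whose identity is (0, 0)."""
    power, count = p, 1
    while power != (0, 0):
        power, count = mul(power, p), count + 1
    return count


elements = list(product(range(2), repeat=2))
orders = sorted(order(p) for p in elements)
assert is_cocycle(alpha) and orders == [1, 2, 4, 4]
```

The element $(0,1)$ has order 4, so for this cocycle the standard extension is cyclic of order 4, a non-splitting extension; the trivial cocycle would instead give $\mathbb{Z}_2 \times \mathbb{Z}_2$.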

3.2 Extensions of Set Maps 

Consider the central extension (1) and a fixed section, $\lambda$, of $\beta$. In this part we
take a map $\phi : G \to A$, with $\phi(1) = 1$, and define an extension to a map from $R$
to $A$. This extension will preserve some of the relative correlation properties of
the original map. When $\phi$ has certain correlation properties the extension will
prove to be a characteristic function of a relative difference set in $R$. We will
establish this in the next part.

Any $r \in R$ may be written in the form $r = \iota(a_r)\lambda(x_r)$ where $x_r = \beta(r)$
and $a_r \in A$ is unique. Therefore, given $\phi : G \to A$ we define the extension,
$\Phi_\lambda : R \to A$, of $\phi$ by

\[ \Phi_\lambda(r) = a_r^{-1}\phi(x_r). \tag{3} \]

If the central extension and section are the standard ones for $\alpha \in Z^2(G,A)$ we
write $\Phi_\alpha$ for $\Phi_{\lambda'}$. We have the following properties for the extension function.

Lemma 3. i) $\{r \in R : \Phi_\lambda(r) = 1\} = \{\iota(\phi(x))\lambda(x) : x \in G\}$;
ii) $\partial\Phi_\lambda = (f_\lambda\partial\phi) \circ (\beta \times \beta)$.

Proof. Only ii) needs any work. For $r,s \in R$ we see $x_{rs} = x_r x_s$ and, because
$\iota(A)$ is in the centre of $R$, we also have $a_{rs} = a_r a_s f_\lambda(x_r,x_s)$. The result now
follows from the definitions of $\Phi_\lambda$ and $\partial\Phi_\lambda$. □

3.3 Relative Difference Sets and Cocycles 

We first define a relative difference set (rds). Suppose $M$ is a group of order $vw$
and $N$ a subgroup of order $w$ where $w \mid v$. A $v$ element subset, $D$, of $M$ is called
a $(v,w,v,v/w)$-rds in $M$ relative to $N$ if the list $d_1d_2^{-1}$, $d_1,d_2 \in D$ contains no
elements of $N$ and each element of $M - N$ exactly $v/w$ times. The fundamental
result connecting such relative difference sets and cocycles is also due to Perera
and Horadam.

Theorem 2. [6, Theorem 4.1] Let $\psi \in Z^2(G,A)$. Then $\psi$ is orthogonal if and
only if $\{(1,x) : x \in G\}$ is a $(v,w,v,v/w)$-rds in $E_\psi$ relative to $A \times 1$. □

We are finally in a position to prove our main result which links the relative
orthogonality and correlation properties of a map and its extension with a
characteristic function for a relative difference set. Recall that $|G| = v$, $|A| = w$ and
$w \mid v$.

Theorem 3. Consider the central extension, (1), and a section, $\lambda$, of $\beta$. Let
$\phi : G \to A$ be such that $\phi(1) = 1$ and let $\Phi_\lambda$ be defined by (3). Then the
following are all equivalent:

i) $f_\lambda\partial\phi$ is orthogonal;

ii) $D = \{r \in R : \Phi_\lambda(r) = 1\}$ is a $(v,w,v,v/w)$-rds in $R$ relative to $\iota(A) = \ker\beta$;

iii) $\partial\Phi_\lambda$ is orthogonal relative to $\iota(A)$;

iv) $\Phi_\lambda$ is correlated relative to $\iota(A)$;

v) $\phi$ is $f_\lambda$-correlated.

Proof. The equivalence of i) and iii) follows from Lemmas 1 and 3; that of iii)
and iv) and also that of i) and v) from Lemma 2. We need only show that i)
and ii) are equivalent. Let $\psi = f_\lambda\partial\phi$. There is an isomorphism $\Gamma : E_\psi \to R$
given by $\Gamma(a,x) = \iota(a\phi(x))\lambda(x)$ (see [4] for this). We see that $\Gamma(a,1) = \iota(a)$.
Let $D^* = \{(1,x) : x \in G\} \subset E_\psi$. Therefore $\Gamma(D^*) = \{\iota(\phi(x))\lambda(x) : x \in G\}$ and,
by Lemma 3, $D = \Gamma(D^*)$. Applying the isomorphism $\Gamma$, Theorem 2 tells us that
the orthogonality of $\psi$ is equivalent to $D = \Gamma(D^*)$ being a $(v,w,v,v/w)$-rds in
$\Gamma(E_\psi) = R$ relative to $\Gamma(A \times 1) = \iota(A) = \ker\beta$. □

In view of part ii) of the above equivalence, we can regard $\Phi_\lambda$ as a characteristic
function for the relative difference set $\{\iota(\phi(x))\lambda(x) : x \in G\}$ in $R$.

If we begin with a cocycle and use the standard extension, (2), and standard 
section for that cocycle we obtain the following corollary. 

Corollary 1. Let $\alpha \in Z^2(G,A)$ and $\phi : G \to A$ with $\phi(1) = 1$. Then the
following are equivalent:

i) $\alpha\partial\phi$ is orthogonal; ii) $\Phi_\alpha(a,x) = a^{-1}\phi(x)$ is correlated relative to $A \times 1$; iii)
$\phi$ is $\alpha$-correlated; iv) $\{(\phi(x),x) : x \in G\}$ is a $(v,w,v,v/w)$-rds in $E_\alpha$ relative to
$A \times 1$. □

We note that instead of starting with a map $\phi$ we could equally well begin with
a $(v,w,v,v/w)$-rds $D$, in $E_\alpha$, and define $\phi(x) = a$ for $(a,x) \in D$. The map $\phi$ is
well-defined on $G$ and would be $\alpha$-correlated.

When we choose $\alpha = 1$ in the above corollary we obtain some well known
results on splitting relative difference sets, because $E_1 = A \times G$.

Corollary 2. i) $\phi$ is correlated if and only if the set $\{(\phi(x),x) : x \in G\}$ is
a $(v,w,v,v/w)$-rds in $A \times G$ relative to $A \times 1$. ii) If $v = w$, let $b \in G$ and
$\Delta_{b,\phi} : G \to A$ be the difference operator $\Delta_{b,\phi}(x) = \phi(x)(\phi(bx))^{-1}$. Then $\Delta_{b,\phi}$ is
onto (equivalently one-to-one) for every $b \ne 1$ if and only if $\{(\phi(x),x) : x \in G\}$
is a $(v,v,v,1)$-rds in $A \times G$ relative to $A \times 1$. □

As an example of the last result: if $G$ is the additive group of the field $GF(q)$,
$q$ odd, then for any $e \ne 0$, $g$, $h \in GF(q)$ there is a “quadratic” $(v,v,v,1)$-rds in
$G \times G$ relative to $G \times 0$, namely $\{(ex^2 + gx + h,x) : x \in G\}$. This is also proved,
basically, in [7].
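This example is easy to confirm numerically when $q$ is an odd prime, so that arithmetic modulo $q$ realises $GF(q)$. The sketch below is an illustration with hypothetical function names; it forms the set $\{(ex^2 + gx + h, x)\}$, lists all differences of distinct elements in $G \times G$, and checks that each element outside $G \times 0$ occurs exactly once while no element of $G \times 0$ occurs.

```python
from collections import Counter


def quadratic_rds(q, e, g, h):
    """The set {(e x^2 + g x + h, x) : x in GF(q)} for an odd prime q."""
    return [((e * x * x + g * x + h) % q, x) for x in range(q)]


def difference_counts(D, q):
    """Multiset of differences d1 - d2 over distinct d1, d2 in D."""
    return Counter(
        ((a1 - a2) % q, (x1 - x2) % q)
        for (a1, x1) in D for (a2, x2) in D
        if (a1, x1) != (a2, x2)
    )


def is_rds(q, e, g, h):
    """True if the set is a (q, q, q, 1)-rds relative to G x 0."""
    counts = difference_counts(quadratic_rds(q, e, g, h), q)
    outside = all(counts[(a, d)] == 1 for a in range(q) for d in range(1, q))
    forbidden = all(counts[(a, 0)] == 0 for a in range(q))
    return outside and forbidden


assert is_rds(5, 1, 0, 0)
assert is_rds(7, 3, 2, 5)
assert not is_rds(5, 0, 1, 0)   # e = 0 gives a linear map, which fails
```

The reason it works is visible in the code: for a fixed second coordinate $d = x_1 - x_2 \ne 0$, the first coordinate equals $d(e(2x_2 + d) + g)$, which runs over all of $GF(q)$ exactly once as $x_2$ varies, because $2ed \ne 0$ when $q$ is odd.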

As we have seen the split extension provides certain examples. So, too, does
the natural extension. Let $M, N$ be groups of order, respectively, $vw, w$ with $N$
in the centre of $M$. Let $D$ be a transversal for the cosets of $N$ in $M$ with $1 \in D$.
The natural short exact sequence

$$N \to M \xrightarrow{\ \pi\ } M/N \to 1,$$

with $\pi(m) = mN$ and section $\lambda(dN) = d$, $d \in D$, defines a cocycle $f_D \in
Z^2(M/N, N)$ as follows. For $d, d' \in D$ we have $dd'N = d^*N$ for unique $d^* \in D$.
So we define $f_D(dN, d'N) = dd'(d^*)^{-1}$. We take the map $\phi$ on $M/N$ to be
identically 1 and extend it to $M$ as before. Write $r \in M$ uniquely as $r =
d_r n_r$, $d_r \in D$, $n_r \in N$ and define $\Phi_\lambda(r) = n_r^{-1}$. We now have the
following by using Theorems 1 and 3.

Corollary 3. For $N$ central in $M$ and a transversal $D$, with $1 \in D$, the following
are equivalent: i) $D$ is a $(v,w,v,v/w)$-rds in $M$ relative to $N$; ii) $\Phi_\lambda(r) = n_r^{-1}$
is correlated relative to $N$; iii) $f_D$ is a $(v/w)$-generalised Hadamard matrix. □
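
A very small instance may help fix the notation. The following sketch is an illustration we add, not an example from the paper: it takes $M = \mathbb{Z}_4$ written additively, $N = \{0, 2\}$ and the transversal $D = \{0, 1\}$, so $v = w = 2$. It builds $f_D$, checks the cocycle identity, checks that $D$ is a relative difference set, and writes the matrix of $f_D$ with $N$ identified with $\{+1, -1\}$.

```python
from itertools import product

M_ORDER = 4                 # M = Z_4, written additively
N = [0, 2]                  # central subgroup N = {0, 2}
D = [0, 1]                  # transversal of N in M containing the identity

def rep(m):
    """Transversal element d* in D with m + N = d* + N."""
    return next(d for d in D if (m - d) % M_ORDER in N)

def f_D(d1, d2):
    """Cocycle f_D(dN, d'N) = d + d' - d*, an element of N."""
    return (d1 + d2 - rep(d1 + d2)) % M_ORDER

def is_cocycle():
    """Check f(x,y) + f(xy,z) = f(y,z) + f(x,yz) on coset representatives."""
    for x, y, z in product(D, repeat=3):
        lhs = (f_D(x, y) + f_D(rep(x + y), z)) % M_ORDER
        rhs = (f_D(y, z) + f_D(x, rep(y + z))) % M_ORDER
        if lhs != rhs:
            return False
    return True

def is_rds():
    """Elements outside N occur exactly v/w = 1 time as d - d'; elements of N never."""
    diffs = [(d1 - d2) % M_ORDER for d1 in D for d2 in D if d1 != d2]
    return all(diffs.count(m) == (0 if m in N else 1) for m in range(M_ORDER))

# Matrix of f_D with the elements 0, 2 of N written as +1, -1.
matrix = [[1 if f_D(d1, d2) == 0 else -1 for d2 in D] for d1 in D]
print(is_cocycle(), is_rds(), matrix)
```

In this case the matrix is the $2 \times 2$ Hadamard matrix, in line with the equivalence of i) and iii).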

The construction of a generalised Hadamard matrix from a relative difference 
set appears in [3]. We have proved the equivalence of these objects in the case 
of the parameters and groups above.

4 Base Sequences and Generalised Perfect Binary Arrays

We have seen that maps that are auto-correlated when the correlation function
is "twisted" by a cocycle can be lifted to produce relative difference sets
in extension groups. In view of this we call $\phi : G \to A$, where $\phi(1) = 1$, a
base sequence with respect to $\alpha \in Z^2(G,A)$ if it satisfies any of the equivalent
conditions of Corollary 1. In the special case of $\alpha$ being symmetric (that is
$\alpha(x,y) = \alpha(y,x)$, $\forall x,y \in G$), $G$ being an abelian group and $A$ having order
2, base sequences have been studied under the name Generalised Perfect Binary
Arrays, or GPBAs, by Jedwab and others (see [2]). We will discuss this
correspondence later.



4.1 Construction of Base Sequences 

Jedwab [2] gives many construction techniques for GPBAs. Some of the tech- 
niques seem specific to the situation he is studying, but many work in more 
general circumstances. We present some of these. By virtue of Theorem 3 these 
techniques give constructions for rds in extension groups. 

Theorem 4. For $\alpha \in Z^2(G,A)$ and $\phi : G \to A$, with $\phi(1) = 1$, we have:

i) If $\alpha$ is cohomologous to $\alpha' \in Z^2(G,A)$, say $\alpha = \alpha'\partial\mu$, then $\phi$ is a base sequence
wrt $\alpha$ if and only if $\mu\phi$ is a base sequence wrt $\alpha'$;

ii) If $\theta : G' \to G$ is an isomorphism, $\phi$ is a base sequence wrt $\alpha$ if and only if
$\phi \circ \theta$ is a base sequence wrt $\alpha \circ (\theta \times \theta) \in Z^2(G',A)$;

iii) Let $\phi_1 : G_1 \to A$ and $\phi_2 : G_2 \to A$. Define the tensor product, $\phi_1 \otimes \phi_2$, to
be $\phi_1 \otimes \phi_2(x_1,x_2) = \phi_1(x_1)\phi_2(x_2)$. Similarly given $\alpha_1 \in Z^2(G_1,A)$ and $\alpha_2 \in
Z^2(G_2,A)$ define $\alpha_1 \otimes \alpha_2$. Then $\phi_1 \otimes \phi_2$ is a base sequence wrt $\alpha_1 \otimes \alpha_2$ if and
only if $\phi_1, \phi_2$ are base sequences wrt $\alpha_1, \alpha_2$ respectively.

iv) Let $\Omega : A \to A'$ be an onto homomorphism. If $\phi$ is a base sequence wrt $\alpha$
then $\Omega \circ \phi$ is a base sequence wrt $\Omega \circ \alpha \in Z^2(G,A')$.

Proof. These are most easily seen using the definition: $\phi$ is a base sequence wrt
$\alpha$ if and only if $\alpha\partial\phi$ is orthogonal. All the parts except iii) follow from Lemma
1 with the following observations: in i) $\alpha\partial\phi = \alpha'\partial(\mu\phi)$; in ii) $(\alpha\partial\phi) \circ (\theta \times \theta) =
\alpha \circ (\theta \times \theta)\,\partial(\phi \circ \theta)$; in iv) $\Omega \circ (\alpha\partial\phi) = (\Omega \circ \alpha)\,\partial(\Omega \circ \phi)$. It remains only to
prove iii). Suppose $\psi = (\alpha_1 \otimes \alpha_2)\,\partial(\phi_1 \otimes \phi_2)$ is orthogonal. Let $\psi_1 = \alpha_1\partial\phi_1$ and
$1 \neq x' \in G_1$. Then

$$\sum_{(x,y) \in G_1 \times G_2} \psi((x',1),(x,y)) = \sum_{(x,y) \in G_1 \times G_2} \psi_1(x',x) \cdot 1 = |G_2| \sum_{x \in G_1} \psi_1(x',x).$$

From the definition of orthogonality we deduce $\psi_1$ is orthogonal. An equivalent
proof works for $\psi_2$. The converse can be proven by a similar argument but is
also a consequence of a result on the Kronecker product of generalised Hadamard
matrices (see [6, Theorem 5.1]). □

Part i) of the preceding theorem tells us that if we want a base sequence
wrt some cocycle, we may as well assume the cocycle is a representative of a
cohomology class in $H^2(G,A)$. If $G = \mathbb{Z}_{s_1} \times \cdots \times \mathbb{Z}_{s_r}$ is abelian, representatives
for cohomology classes of symmetric cocycles are easy to describe. Indeed, let
$\mathrm{Ext}(G,A) = \{\alpha \in H^2(G,A) : \alpha \text{ symmetric}\}$, then $\mathrm{Ext}(G,A) \cong \prod_{i,j} \mathbb{Z}_{(s_i,t_j)}$,
where $A = \mathbb{Z}_{t_1} \times \cdots \times \mathbb{Z}_{t_m}$ and $(s_i,t_j)$ refers to the greatest common divisor of
$s_i, t_j$ (see [4] for this). In view of this fact, and the theorem above, if we seek a
base sequence wrt some symmetric cocycle when $G$ is abelian, we may as well
assume $G = \mathbb{Z}_{p^s}$, $A = \mathbb{Z}_{q^t}$ for primes $p, q$. If $p \neq q$ then we may take the cocycle
to be a coboundary and we are in the situation of Corollary 2.

4.2 GPBAs 

We will now outline how we may regard a GPBA as a base sequence wrt a
very specific cocycle. Full details of proofs are given in [5]. We take the definition
of GPBA from [2]. Let $G = \mathbb{Z}_{s_1} \times \cdots \times \mathbb{Z}_{s_r}$, $A = \{\pm 1\}$ be a group of order two,
$\mathbf{z} = (z_1, \ldots, z_r)$ where $z_i = 0$ or $1$, and $\mathbf{s} = (s_1, \ldots, s_r)$. Also, let $\phi : G \to A$ be
a map of groups with $\phi(1) = 1$. If $\mathbf{z} = 0$ then $\phi$ is called a GPBA($\mathbf{s}$) of type 0, or
a PBA($\mathbf{s}$), if it is correlated. When $\mathbf{z} \neq 0$ a more involved definition is needed.
We define yet more groups:

$$G^* = \mathbb{Z}_{(z_1+1)s_1} \times \cdots \times \mathbb{Z}_{(z_r+1)s_r}.$$

Thus, the arithmetic in the $i$-th coordinate of $G^*$ is mod $2s_i$ or mod $s_i$ according
as $z_i = 1$ or $z_i = 0$. Further, define the following subgroups of $G^*$,

$$H = \{h \in G^* : h_i = 0 \text{ if } z_i = 0;\ h_i = 0 \text{ or } s_i \text{ if } z_i = 1\},$$

$$K = \{k \in H : k \text{ has even weight}\}.$$

We now define an extension of $\phi$ to $G^*$. We may write any $g \in G^*$ uniquely
in the form $g = x + h$ where $x \in G$ and $h \in H$ by taking $x = g \bmod \mathbf{s} =
(g_1 \bmod s_1, \ldots, g_r \bmod s_r)$ and $h = g - x$. Here $g_i \bmod s_i$ refers to the unique
residue in the range $0, \ldots, s_i - 1$. Now put

$$\varepsilon(g) = \begin{cases} \phi(x) & \text{if } h \in K \\ -\phi(x) & \text{if } h \notin K. \end{cases}$$

Finally, then, $\phi$ is called a GPBA($\mathbf{s}$) of type $\mathbf{z} \neq 0$ if $\varepsilon$ is correlated relative to $H$.
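
The extension $\varepsilon$ is easy to compute mechanically. The sketch below is our own illustration: the function names are hypothetical, and it assumes that "correlated relative to $H$" means that the periodic autocorrelation of $\varepsilon$ over $G^*$ vanishes at every shift outside $H$ (the precise definition is given earlier in the paper and is not repeated here). It treats the smallest case $G = \mathbb{Z}_2$, $\mathbf{z} = (1)$, $\phi = (+1, +1)$.

```python
from itertools import product

def expand(phi, s, z):
    """Extend phi : G -> {+1,-1} to eps : G* -> {+1,-1} as described in the text.

    phi maps tuples x (with 0 <= x_i < s_i) to +1 or -1; s and z are tuples.
    """
    moduli = [(zi + 1) * si for si, zi in zip(s, z)]        # coordinates of G*
    eps = {}
    for g in product(*(range(m) for m in moduli)):
        x = tuple(gi % si for gi, si in zip(g, s))          # x = g mod s
        h = tuple(gi - xi for gi, xi in zip(g, x))          # h = g - x lies in H
        weight = sum(1 for hi in h if hi != 0)
        eps[g] = phi[x] if weight % 2 == 0 else -phi[x]     # h in K iff even weight
    return eps, moduli

def autocorrelation(eps, moduli, u):
    """Periodic autocorrelation sum_g eps(g) * eps(g + u) over G*."""
    total = 0
    for g, value in eps.items():
        shifted = tuple((gi + ui) % m for gi, ui, m in zip(g, u, moduli))
        total += value * eps[shifted]
    return total

# Smallest case: G = Z_2, type z = (1), phi = (+1, +1); here H = {0, 2} in G* = Z_4.
phi = {(0,): 1, (1,): 1}
eps, moduli = expand(phi, s=(2,), z=(1,))
print([eps[(g,)] for g in range(4)])
print([autocorrelation(eps, moduli, (u,)) for u in range(4)])
```

Under the stated assumption this $\phi$ is a GPBA(2) of type (1): the autocorrelation is zero at the shifts 1 and 3, which are the elements of $G^*$ outside $H$.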

The concept of GPBA relates to the ideas in the earlier part of this paper
by taking the short exact sequence

$$A \xrightarrow{\ \iota\ } G^*/K \xrightarrow{\ \beta\ } G \to 0,$$

where $\iota$ is the homomorphism with $\iota(A) = H/K$ and $\beta$ the homomorphism $g + K \mapsto
g \bmod \mathbf{s}$. Using $\lambda(x) = x + K$ as a section of $\beta$ we have, in the notation of the
previous section, $\varepsilon(g) = \Phi_\lambda(g + K)$. We may prove from the results earlier that
$\varepsilon$ is correlated relative to $H$ if and only if $\Phi_\lambda$ is correlated relative to $H/K$. The
section $\lambda$ defines a cocycle in $Z^2(G,A)$, which we call $f_J$, in the usual way and
we have the following result.

Theorem 5. [5, Theorem 5.6] For any $\mathbf{z}$, $\phi$ is a GPBA($\mathbf{s}$) of type $\mathbf{z}$ if and only
if $\phi$ is a base sequence wrt $f_J$. □

Finally we note that, because of the isomorphism of $\mathrm{Ext}(G,A)$ mentioned
above, $f_J$ depends only on $\mathbf{s}$ and $\mathbf{z}$, and in a particularly simple manner (for
details see [5]).

Abstract. This paper presents an example of integrating recently developed
finite-field wavelet transforms into the study of error correcting
codes. The primary goal of the paper is to demonstrate a straightforward
approach to analyzing double circulant self-dual codes over any
arbitrary finite field using orthogonal filter bank structures. First, we
discuss the proper combining of the cyclic mother wavelet and scaling
sequence to satisfy the requirement of self-dual codes. Then as an example,
we describe the encoder and decoder of a (12,6,4) self-dual code,
and we demonstrate the simplicity and the computation reduction that
the wavelet method offers for the encoding and decoding of this code.
Finally, we give the mother wavelet and scaling sequence that generate
the (24,12,8) Golay code.



1 Introduction 

Although wavelets and filter banks over real or complex fields have been studied
extensively for years, the finite-field wavelet transform has received little attention
because of the very limited application of this transform in the signal and
image processing area. In the past there was some interest in extending wavelet
transforms to the situation in which the complex field is replaced with a finite
field. In [1] the authors show that unlike the real field case, there is no complete
factorization technique for paraunitary filter banks (FB) over GF($p$), for $p$ a
prime, using degree-one and degree-two building blocks. Relying on the Fourier
transform defined over GF($p^r$), the authors of [2] construct a wavelet transform
for finite dimensional sequences over fields with a characteristic other than 2,
$p \neq 2$. In [3] an alias cancellation approach was used to design two-band filter
banks over finite fields. The main problem with this approach was excluding
fields of characteristic two, GF($2^r$) for $r > 1$, in which an element of order two
does not exist. An extensive review of finite field transforms can be found in
[4]. Recently, a formulation of the wavelet decomposition of a vector space over
any finite field has been derived in [5]. Since that formulation relies on a basis
decomposition in the time domain rather than in the frequency domain, it does
not require the existence of the number theoretic Fourier transform. Thus it becomes
more attractive, particularly for the fields GF($2^r$) having characteristic 2.

The objective of this paper is to present an example of our attempt to bring
together finite-field wavelet transforms and error correcting codes and to study 
error control coding in the signal processing context. This unified view uncovers 
a rich set of signal processing techniques that can be exploited to investigate new 
error correcting codes, and to simplify the encoding and decoding techniques of 
some existing codes. 

Self-dual codes have been widely studied [6]. In this paper, first we discuss
a model to generate double circulant self-dual codes over any arbitrary finite
field using the notion of finite-field wavelet transform. Then as an example, we
present an encoding and decoding method for the (12,6,4) binary self-dual code.
More in-depth study of double circulant self-dual codes using cyclic orthogonal
and biorthogonal wavelet transforms can be found in [7].

Let $F^N$ be the vector space of $N$-tuples over the finite field $F$. Then a self-dual
code of length $N$ is a subspace $C$ of $F^N$ such that $C = C^\perp$, where $C^\perp$ is
defined as:

$$C^\perp = \{v \in F^N : \langle v, c \rangle = 0 \ \ \forall c \in C\} \tag{1}$$

Clearly, $C$ has dimension $M = N/2$ and is called an $(N, M, d)$ code where $d$ is the
minimum distance of the code. Since the primary focus of this paper is on linear
block codes, before talking explicitly about self-dual codes, we briefly summarize
some results on cyclic wavelet transforms.



2 Cyclic Wavelet Transforms over the Field F 

It is well known that wavelet decomposition and reconstruction can be implemented
as the analysis and synthesis components of a perfect reconstruction
filter bank, respectively. Figure 1 shows the analysis and synthesis bank of a
two-channel perfect reconstruction filter bank in which the synthesis filters $g_0(n)$
and $g_1(n)$ are the scaling sequence and mother wavelet, respectively. In [5] the
authors show how to decompose a vector space $V$ over a finite field $F$ onto two
orthogonal subspaces $V_0$ and $W_0$. Particularly, to have a two-channel perfect
reconstruction orthogonal filter bank, a design methodology is presented to obtain
the analysis and synthesis filters over the fields of characteristic.

Since the codewords $c(n)$ of self-dual codes have finite even length, the vector
space $V$ is considered to be a vector space of finite dimension $N = 2M$.
It can also be regarded as a space of periodic sequences of period $N$. Now,
consider a two-channel perfect reconstruction orthogonal filter bank with the
scaling sequence $g_0(n) = \{g_0(0), g_0(1), \ldots, g_0(N-1)\}$ and the mother wavelet
$g_1(n) = \{g_1(0), g_1(1), \ldots, g_1(N-1)\}$, where the mother wavelet is the time
reverse of the scaling function, $g_1((n))_N = g_0((-n-1))_N$. It is worth noting that
throughout the paper $((\cdot))_N$ denotes a modulo-$N$ operation, or equivalently an
$N$-point circular shift. Furthermore, the analysis filters are related to the synthesis
filters by:


$$h_j((n))_N = g_j((-n))_N, \qquad j = 0,1, \quad n = 0, \ldots, N-1. \tag{2}$$

In the study of block codes, we will frequently use the fact that the algebra
of $M \times M$ one-circulant matrices over the field $F$ is isomorphic to the algebra
of polynomials in the ring $F[z^{-1}]/(z^{-M} - 1)$. This isomorphism allows us to
simplify the proofs of the relations using matrix notation instead of $z$-transforms.
Therefore, we introduce a matrix representation to express the relations of the
cyclic filter bank. In the analysis bank, the filtering operation of periodic signals
followed by decimation by a factor of two can be described using 2-circulant
matrices. In other words, let $H_0 : V \to V_0$ and $H_1 : V \to W_0$ be two linear
transformations ($M \times N$ matrices in matrix notation) that project a codeword
$c(n) \in V$ onto two orthogonal spaces $V_0$ and $W_0$ with wavelet coefficients $x_0(n)$
and $x_1(n)$, respectively. Then we have:

$$x_0(n) = \sum_{i=0}^{N-1} c(i)\,h_0((2n-i))_N = (H_0 c)(n), \qquad
x_1(n) = \sum_{i=0}^{N-1} c(i)\,h_1((2n-i))_N = (H_1 c)(n), \tag{3}$$



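in which $H_0$ and $H_1$ are 2-circulant matrices. By using (2), these matrices can
be written as:

$$H_j = \begin{bmatrix}
g_j(0) & g_j(1) & g_j(2) & \cdots & g_j(N-1) \\
g_j(N-2) & g_j(N-1) & g_j(0) & \cdots & g_j(N-3) \\
\vdots & \vdots & \vdots & & \vdots \\
g_j(2) & g_j(3) & g_j(4) & \cdots & g_j(1)
\end{bmatrix}, \qquad j = 0, 1. \tag{4}$$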



Similarly, in the synthesis bank, the upsampling of periodic signals by a
factor of two followed by the filtering operation can be described by 2-circulant
matrices $G_0$ and $G_1$:

$$c(n) = \sum_{i=0}^{M-1} x_0(i)\,g_0((n-2i))_N + \sum_{i=0}^{M-1} x_1(i)\,g_1((n-2i))_N
= (G_0 x_0)(n) + (G_1 x_1)(n). \tag{5}$$



Because of the relation (2), the synthesis matrices turn out to be the transposes
of the analysis matrices:

$$G_j = H_j^T, \qquad j = 0, 1. \tag{6}$$

From the perfect reconstruction constraint we deduce that:

$$\begin{bmatrix} H_0^T & H_1^T \end{bmatrix}
\begin{bmatrix} H_0 \\ H_1 \end{bmatrix} = I_{N \times N}. \tag{7}$$

Furthermore, since the operator $T = [H_0^T \ H_1^T]^T$ is a one-to-one mapping $V \to V_0 \times W_0$
in the finite-dimensional vector space, it is onto as well. Hence
$T^T T = I$ implies that $T T^T = I$ (i.e., $T$ is unitary). Therefore:

$$H_j H_j^T = I_{M \times M}, \quad j = 0, 1, \qquad \text{and} \qquad H_0 H_1^T = 0_{M \times M}. \tag{8}$$

[Figure: Analysis Bank | Synthesis Bank]

Fig. 1. Diagram of the two-band filter bank.
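
Relations (6)-(8) can be checked numerically for a concrete filter pair. The sketch below is an illustration we add; it uses the binary length-12 scaling sequence and mother wavelet that the paper quotes in eq. (12) for the (12,6,4) code, and the helper names (`analysis_matrix`, `matmul2`, and so on) are ours. Matrices are plain nested lists and all arithmetic is modulo 2.

```python
N, M = 12, 6
g0 = [int(b) for b in "101010010001"]   # scaling sequence g0(n), eq. (12)
g1 = [int(b) for b in "100010010101"]   # mother wavelet g1(n), eq. (12)

def analysis_matrix(g):
    """M x N 2-circulant matrix of eq. (4): entry (n, i) is g((i - 2n) mod N)."""
    return [[g[(i - 2 * n) % N] for i in range(N)] for n in range(M)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul2(A, B):
    """Matrix product with entries reduced modulo 2."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in Bt] for row in A]

def add2(A, B):
    """Entrywise sum modulo 2."""
    return [[(a + b) % 2 for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[int(r == c) for c in range(n)] for r in range(n)]

def zeros(n):
    return [[0] * n for _ in range(n)]

H0, H1 = analysis_matrix(g0), analysis_matrix(g1)
G0, G1 = transpose(H0), transpose(H1)                 # eq. (6)

# eq. (7): perfect reconstruction, G0 H0 + G1 H1 = I
eq7_holds = add2(matmul2(G0, H0), matmul2(G1, H1)) == identity(N)
# eq. (8): orthonormality within and orthogonality between the two channels
eq8_holds = (matmul2(H0, G0) == identity(M)
             and matmul2(H1, G1) == identity(M)
             and matmul2(H0, G1) == zeros(M))
print(eq7_holds, eq8_holds)
```

Over GF(2) the conditions in (8) say that each filter has odd weight and that all auto- and cross-correlations at even shifts are even, which is what the products above test.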

3 Double Circulant Self-Dual Codes 

Assume that $C \subset F^N$ is a double circulant self-dual code. The double circulant
property requires that if $c(n) = \{c(0), c(1), \ldots, c(N-1)\}$ is a codeword in $C$,
then $c((n-2))_N$ is also in $C$. Let $c(n)$ be a codeword that is decomposed by
the analysis bank of Fig. 1 to its wavelet coefficients $x_0(n)$ and $x_1(n)$. Although
we do not need its result in this paper, we quote the following theorem from [7]
to maintain the generality of the discussion.

Theorem 1. Suppose the codeword $c(n) \in C$ corresponds to the message block
$m(n)$, then for any double circulant self-dual code $C$ there exists a cyclic orthogonal
wavelet transform that maps the codeword $c(n)$ to the message block, i.e.,
$x_0(n) = x_1(n) = m(n)$.

3.1 Encoder Structure 

Figure 2a shows the encoder of an $(N, M, d)$ code that maps the message block
$m(n)$ of size $M = N/2$ to the codeword $c(n)$. The encoder is realized by the
synthesis portion of the two-band filter bank in which $g_0(n)$ and $g_1(n)$ are an
orthonormal wavelet basis over $F$. The parameter $a \in F$ introduced in this
structure will be specified later to meet the self-dual constraint. The delay
$0 \le l \le M - 1$ controls the minimum distance of the generated code,
and will also be discussed later. From the linearity of the wavelet transform, it
is obvious that the generated code is linear. Furthermore, if $c(n)$ is a codeword
associated with the message $m(n)$, then by the property of the multirate filters,
there exists a message datum $m((n-1))_M$ that is mapped to the codeword
$c((n-2))_N$. Therefore, the code is double circulant. It is also worth noting that
using the combined scaling sequence and the mother wavelet (synthesis bank) as
the encoder guarantees that the mapping of the message $m(n)$ onto the codeword
$c(n)$ is a one-to-one mapping. This is true because the message block can
be extracted by filtering $c(n)$ through $h_0(n)$ and downsampling it by a factor of
two. In the following, we show that regardless of the amount of the delay $l$, the
encoder generates a double circulant self-dual code in $F^N$.

Using the matrix notation that we developed for the cyclic filter bank in
Section 2, the $N \times M$ generator matrix $G$ of this code can be written as:

$$G = G_0 + a\,G_1 \Pi_l \tag{9}$$

in which $\Pi_l$ is an $M \times M$ monomial matrix, which is a permutation of the
identity matrix if the field is GF($2^r$). In fact $\Pi_l$ is a one-circulant matrix defined
by its first row that is zero everywhere except the $(l + 1)$ position. Due to
the isomorphism of the algebra of one-circulant matrices and the algebra of
polynomials, the following statements can be readily proved. First, one can show
that $\Pi_l^T$ is also a one-circulant matrix whose first row is zero everywhere except
at the $(M - l + 1)$ position. Furthermore, it can be proved that

$$\Pi_l^T \Pi_l = I_{M \times M}. \tag{10}$$

Now, recalling the necessary and sufficient condition for self-dual codes, it
is deduced that the $M$ columns of the generator matrix $G$ that are linearly
independent must also lie in the dual space. Since $\dim C = \dim C^\perp = M$, then
the columns of $G$ specify a basis set for the dual space as well. Consequently the
generator matrix and the parity-check matrix of the code are the same. From
the above argument we conclude that the if and only if condition of the self-dual
codes is equivalent to $G^T G = 0$. Using (9), (6), (8), and (10), we derive the
following equality for the generator matrix of the encoder (Fig. 2a):

$$G^T G = I + a^2 I. \tag{11}$$

Consequently, to meet the self-dual condition, we require that $a^2 + 1 = 0$ for
$a \in F$.
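
For completeness, the steps behind (11) can be written out as follows. This is a sketch of the computation under the assumption that (9) reads $G = G_0 + a\,G_1\Pi_l$; the annotations indicate which of the relations (6), (8) and (10) is used at each step.

```latex
\begin{aligned}
G^T G &= (G_0 + a\,G_1\Pi_l)^T (G_0 + a\,G_1\Pi_l) \\
      &= G_0^T G_0 + a\,G_0^T G_1 \Pi_l + a\,\Pi_l^T G_1^T G_0
         + a^2\,\Pi_l^T G_1^T G_1 \Pi_l \\
      &= H_0 H_0^T + a\,H_0 H_1^T \Pi_l + a\,\Pi_l^T H_1 H_0^T
         + a^2\,\Pi_l^T H_1 H_1^T \Pi_l && \text{by (6)} \\
      &= I + a^2\,\Pi_l^T \Pi_l && \text{by (8)} \\
      &= I + a^2 I && \text{by (10)}.
\end{aligned}
```

In particular, over GF(2) the only choice is $a = 1$, since $1 + 1 = 0$.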



[Figure: (a) Encoder Structure; (b) Syndrome Generator, with input labelled "channel error: e(n)"]

Fig. 2. Filter bank structure of the encoder and syndrome generator.



Our previous discussion of self-dual codes is valid over any arbitrary finite
field. Now, let us study the (12,6,4) code as an example in the binary field. As
explained in [5], we choose the mother wavelet to be $g_1(n) = \{100010010101\}$.
Therefore, the scaling sequence and the analysis filters are obtained by the relation
$g_0((n))_N = g_1((-n-1))_N$ and (2) as:

$$g_0(n) = \{101010010001\}, \qquad h_0(n) = \{110001001010\}, \qquad h_1(n) = \{110101001000\}. \tag{12}$$

The parameter $a$ in GF(2) is equal to one, and the choice of the delay $l$ determines
the minimum distance of the generated code. Choosing the delay $l$ from
the set $\{1, 2, 4, 5\}$ generates a code with minimum distance four, while the other
choices reduce the minimum distance of the code to two. It is worth noting that
the codes generated by different values of the delay from the set $\{1, 2, 4, 5\}$ are
all equivalent to each other. By using any delay value from this set, all of the
codeword weights are a multiple of two and the weight enumerator of the code is:

$$A_4 = 15, \quad A_6 = 32, \quad A_8 = 15, \quad A_{12} = 1, \tag{13}$$

in which $A_i$ denotes the number of codewords of weight $i$. In the rest of the
paper, we choose delay $l$ to be one.
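
The encoder is small enough to enumerate. The following sketch is our own illustration of (5) and (9) for the binary filters of (12) with $a = 1$; the function names are hypothetical, and $\Pi_l$ is read as the circulant whose first row has its single 1 in position $l + 1$, so that $(\Pi_l m)(i) = m((i + l) \bmod M)$. It lists the weight distribution of the 64 codewords and checks $G^T G = 0$ on the generator columns.

```python
from collections import Counter
from itertools import product

N, M = 12, 6
g0 = [int(b) for b in "101010010001"]   # scaling sequence, eq. (12)
g1 = [int(b) for b in "100010010101"]   # mother wavelet, eq. (12)

def encode(m, l=1):
    """Codeword c = G0 m + G1 Pi_l m over GF(2) (a = 1).

    Assumes (Pi_l m)(i) = m((i + l) mod M), one reading of the text's Pi_l.
    """
    x1 = [m[(i + l) % M] for i in range(M)]
    return [
        sum(m[i] * g0[(n - 2 * i) % N] + x1[i] * g1[(n - 2 * i) % N]
            for i in range(M)) % 2
        for n in range(N)
    ]

def weight_distribution(l=1):
    """Map weight -> number of codewords, over all 2^M messages."""
    return Counter(sum(encode(m, l)) for m in product((0, 1), repeat=M))

def is_self_orthogonal(l=1):
    """Check G^T G = 0 by testing all pairs of generator columns."""
    basis = [encode([int(i == k) for i in range(M)], l) for k in range(M)]
    return all(sum(a * b for a, b in zip(u, v)) % 2 == 0
               for u in basis for v in basis)

print(sorted(weight_distribution(1).items()))
# [(0, 1), (4, 15), (6, 32), (8, 15), (12, 1)]
print(is_self_orthogonal(1))
```

With $l = 1$ the enumeration reproduces the weight enumerator (13), together with the zero codeword. Calling `weight_distribution` with other delays gives a quick way to compare the remaining choices of $l$.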



3.2 Syndrome Generator and Decoder Structure 

In the following we show that the structure in Fig. 2b constructs the syndrome of
the code. Again, by using the relations that we developed for cyclic filter banks,
we write:

$$s(n) = (H_0 + a\,\Pi_l^T H_1)(c + e)(n), \tag{14}$$

in which $e(n)$ is the error pattern due to the communication channel. Given the
self-dual requirement $a^2 + 1 = 0$, the above equality is simplified further by
combining the equations (6), (8), and (10) as:

$$s(n) = (H_0 + a\,\Pi_l^T H_1)\,e(n). \tag{15}$$

Therefore, the output of the system in Fig. 2b depends only on the error pattern.
This structure can be simplified as in Fig. 3 in which $h_2(n) = h_0(n) +
a\,h_1((n + 2l))_N$. Now, the remaining problem is to interpolate the low ($M$) dimensional
syndrome $s(n)$ into the higher ($N$) dimensional error pattern $e(n)$. This
is a conditional minimum distance estimation problem in the signal processing
context. It is conditional because more than one error pattern is mapped into
the same low dimensional syndrome. Therefore, the interpolator should choose
(out of those possible valid choices) the error pattern that is most likely (has
minimum weight) to achieve the maximum likelihood (ML) decoder performance.
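
That the syndrome depends only on the error pattern can be confirmed on the binary example. The sketch below is an illustration under the same assumptions as before ($a = 1$, $l = 1$, and $(\Pi_l m)(i) = m((i + l) \bmod M)$, so that $\Pi_l^T$ shifts in the opposite direction); the function names are ours.

```python
from itertools import product

N, M = 12, 6
g0 = [int(b) for b in "101010010001"]   # scaling sequence, eq. (12)
g1 = [int(b) for b in "100010010101"]   # mother wavelet, eq. (12)

def analysis(r, g):
    """(H_j r)(n) = sum_i r(i) g((i - 2n) mod N): filter by h_j(n) = g_j(-n), decimate by two."""
    return [sum(r[i] * g[(i - 2 * n) % N] for i in range(N)) % 2 for n in range(M)]

def encode(m, l=1):
    """c = G0 m + G1 Pi_l m with (Pi_l m)(i) = m((i + l) mod M) and a = 1."""
    x1 = [m[(i + l) % M] for i in range(M)]
    return [sum(m[i] * g0[(n - 2 * i) % N] + x1[i] * g1[(n - 2 * i) % N]
                for i in range(M)) % 2 for n in range(N)]

def syndrome(r, l=1):
    """s = (H0 + Pi_l^T H1) r; Pi_l^T undoes the shift applied by Pi_l."""
    y0, y1 = analysis(r, g0), analysis(r, g1)
    return [(y0[n] + y1[(n - l) % M]) % 2 for n in range(M)]

m = [1, 0, 1, 1, 0, 0]                     # arbitrary message block
c = encode(m)
e = [0] * N
e[7] = 1                                   # single channel error
r = [(ci + ei) % 2 for ci, ei in zip(c, e)]
print(syndrome(c), syndrome(r) == syndrome(e))
```

Every codeword has zero syndrome, so the syndrome of the received word equals that of the error alone, as (15) states.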



[Figure: (a) Syndrome Generator, with input labelled "channel error: e(n)"; (b) Polyphase Structure of the Syndrome Generator, with input e(n)]

Fig. 3. Polyphase representation of the syndrome generator.



Since the estimator is a signal dependent operator, there is no single operator
solution for this estimation problem. Hence, to keep the decoder as simple as
possible, we design the interpolator to be exact for all the weight-one error
patterns. Therefore, like the ML decoder, this decoder is guaranteed to correct
all errors of weight one. Our approach to designing the decoder is based on inverting
the polyphase filters of the syndrome generator filter $h_2(n)$. Figure 3b shows the
polyphase structure of the syndrome generator in which $u_{00}(n) = h_2(2n)$ and
$u_{01}(n) = h_2(2n + 1)$ are the polyphase components of $h_2(n)$. Furthermore, let
$r_{00}(n)$ and $r_{01}(n)$ be two filters with the $z$-transform satisfying:

$$R_{0i}(z)\,U_{0i}(z) = 1 \bmod (z^{-M} - 1), \qquad i = 0, 1. \tag{16}$$

In other words, these two filters are the circular inverses of $u_{00}(n)$ and $u_{01}(n)$,
respectively. Now, let us define two sets $E_0$ and $E_1$ by distinguishing those errors
that occur only in the even time indexes from those that occur in the odd time indexes,
respectively:

$$\begin{aligned}
E_0 &= \{e(n) = \{\zeta_0, 0, \zeta_1, 0, \ldots, \zeta_{M-1}, 0\} \text{ where } \zeta_i \in \{0,1\}\ \forall i\} \\
E_1 &= \{e(n) = \{0, \eta_0, 0, \eta_1, \ldots, 0, \eta_{M-1}\} \text{ where } \eta_i \in \{0,1\}\ \forall i\}.
\end{aligned} \tag{17}$$

If $e(n) \in E_0$, then $s_{01}(n) = 0$ and $s(n) = s_{00}(n)$. Therefore, we are required to
invert the filter $u_{00}(n)$ to estimate the error. By this argument, we realize that
whenever $e(n) \in E_0$, the output at position 1 in Fig. 4 is the exact interpolation
of the syndrome to $e(n)$.
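
For the binary (12,6,4) example the circular inverses in (16) can be found by exhaustive search, since there are only $2^6$ candidate filters. The sketch below is our illustration, not the paper's procedure: it forms $h_2(n) = h_0(n) + h_1((n + 2l))_N$ with $l = 1$ and $a = 1$ from the filters of (12), splits it into its polyphase components, and searches for $r_{00}$ and $r_{01}$. The function names are hypothetical.

```python
from itertools import product

N, M, l = 12, 6, 1
h0 = [int(b) for b in "110001001010"]   # analysis filter h0(n), eq. (12)
h1 = [int(b) for b in "110101001000"]   # analysis filter h1(n), eq. (12)

# h2(n) = h0(n) + h1((n + 2l) mod N) over GF(2), as stated in the text with a = 1.
h2 = [(h0[n] + h1[(n + 2 * l) % N]) % 2 for n in range(N)]
u00, u01 = h2[0::2], h2[1::2]           # polyphase components h2(2n), h2(2n + 1)

def circ_conv(a, b):
    """Circular convolution over GF(2): polynomial product modulo z^-M - 1."""
    return [sum(a[k] * b[(n - k) % M] for k in range(M)) % 2 for n in range(M)]

def circular_inverse(u):
    """Brute-force search for r with r * u = delta; None if u is not invertible."""
    delta = [1] + [0] * (M - 1)
    for r in product((0, 1), repeat=M):
        if circ_conv(list(r), u) == delta:
            return list(r)
    return None

r00, r01 = circular_inverse(u00), circular_inverse(u01)
print(u00, r00)
print(u01, r01)
# Response of the even-branch inverse to a syndrome produced by one odd-index error:
print(sum(circ_conv(r00, u01)))
```

The last line gives the weight of $r_{00}$ applied to $u_{01}$; circular shifts of $u_{01}$ yield shifted outputs of the same weight.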




[Figure: block labelled "Conditional Interpolator"]

Fig. 4. Filter structure to reconstruct the message sequence.



Taking into account that half of the weight-one errors belong to the set
$E_1$, we need to identify these cases and be able to correct these errors as well.
To do that, let us define the set $E_{11}$ as:

$$E_{11} = \{e(n) : e(n) \in E_1 \ \&\ wt(e) = 1\} \tag{18}$$

in which $wt(e)$ means the weight of the error. It is clear that if $e(n) \in E_{11}$,
then $s(n) = u_{01}((n - n_0))_M$ ($n_0$ depends on the location of the '1' in $e(n)$),
which generates a weight-five output at node 1. Hence, whenever the weight
of the output at node 1 is five, we select the output at node 2 as a correct
estimate of the error. It is worth noting that weight-five outputs at node 1 are
also produced by errors from the set $E_{05} = \{e(n) : e(n) \in E_0 \ \&\ wt(e) = 5\}$.
Consequently, the errors correctable by the decoder of Fig. 4 are from the set
$\{E_0 \cup E_{11}\} - E_{05}$.

Table 1. The number of correctable errors (by error weight) by the filter decoder
in Fig. 4 and the ML decoder.

Weight of the Error    Filter Method    ML Decoder
1                      12               12
2                      15               31
3                      20               20
4                      15               -
6                      1                -

Table 1 gives the correctable errors by the filtering technique and ML decoder.
In the following we investigate the amount of extra signal-to-noise ratio (SNR)
that is required by the filter method to achieve the same performance as the
ML decoder. Suppose $p_b$ is the probability of a bit error and $P_{err}$ is the word
error probability. Then we can determine the word error rate by applying the
formula $P_{err} = 1 - \sum_i \beta_i\, p_b^i (1 - p_b)^{N-i}$, in which $\beta_i$ is the total number of
correctable errors of weight $i$ (given in Table 1). To plot the word error rate as a
function of SNR, we choose DPSK and noncoherent FSK detection methods in
which $p_b$ is $(1/2)e^{-SNR}$ and $(1/2)e^{-SNR/2}$, respectively. These graphs show that
only a small amount of extra SNR is required by the filter decoding method to achieve the
performance of the syndrome table lookup decoder.
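
The word error rate formula is straightforward to evaluate from Table 1. The sketch below is an illustration we add, with two assumptions that are not spelled out in the text: the error-free pattern is counted as $\beta_0 = 1$ for both decoders, and the SNR in the bit error formulas is the linear ratio obtained from the decibel value. The names `BETA_FILTER`, `BETA_ML`, and the function names are ours.

```python
import math

N = 12
# Correctable error patterns by weight (Table 1), plus beta_0 = 1 (assumption).
BETA_FILTER = {0: 1, 1: 12, 2: 15, 3: 20, 4: 15, 6: 1}
BETA_ML = {0: 1, 1: 12, 2: 31, 3: 20}

def bit_error_dpsk(snr_db):
    """p_b = (1/2) exp(-SNR) with SNR converted from dB to a linear ratio."""
    snr = 10 ** (snr_db / 10)
    return 0.5 * math.exp(-snr)

def bit_error_fsk(snr_db):
    """p_b = (1/2) exp(-SNR/2) for noncoherent FSK."""
    snr = 10 ** (snr_db / 10)
    return 0.5 * math.exp(-snr / 2)

def word_error_rate(p_b, beta):
    """P_err = 1 - sum_i beta_i p_b^i (1 - p_b)^(N - i)."""
    return 1 - sum(b * p_b ** i * (1 - p_b) ** (N - i) for i, b in beta.items())

for snr_db in (4, 6, 8, 10):
    p = bit_error_fsk(snr_db)
    print(snr_db, word_error_rate(p, BETA_FILTER), word_error_rate(p, BETA_ML))
```

Both decoders correct 64 patterns in total, one per coset; they differ only in which weight-two and higher patterns are chosen, which is why the two curves stay close.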

It is important to note that the (12,6,4) self-dual code has been studied only
as an example, and the wavelet method described in this paper can be used to
generate self-dual codes of any length. As a final remark we give the generator
filters of the two double circulant even self-dual codes (8,4,4) and (24,12,8)
[7]. The (8,4,4) code is generated by a cyclic orthogonal filter bank with the
scaling sequence and the mother wavelet equal to $g_0$ = {QD} and $g_1$ = {4C},
respectively. Similarly, the cyclic orthogonal filter bank that generates the Golay
code (24,12,8) is constructed by $g_0$ = {A80011} and $g_1$ = {401? £>55} (with no
need for a delay element in Fig. 2a). Note that the filter coefficients in GF(2) are
represented in hexadecimal form. Furthermore, there exist several other pairs of
scaling and mother wavelets that generate equivalent codes [7].

4 Conclusion 

We report a new approach to study and implement self-dual codes by using finite- 
field wavelet transforms. This method allows us to construct double circulant self- 
dual codes of arbitrary length over any finite-field in a straightforward manner. 
We also introduce a decoder based on a polyphase filter inversion methodology. 
This decoder achieves nearly the performance of the syndrome table lookup 
decoder. Our approach, in addition to being a powerful tool to investigate error 
correcting codes, reduces the complexity and computation costs in the encoder 
and decoder by the polyphase implementation of the multirate filters.

[Figure: word error rate versus SNR (dB), from 4 to 13 dB; panel title "Noncoherent FSK Method"]



Fig. 5. Comparison of the word error rate of the filter decoding technique with
that of the ML decoder method in DPSK and noncoherent FSK.

Abstract. We give a short survey of the results obtained in the last several
decades that develop the theory of linear codes and polylinear recurrences
over finite rings and modules following the well-known results on
codes and polylinear recurrences over finite fields. The first direction contains
the general results of the theory of linear codes, including: the concepts
of a reciprocal code and the MacWilliams identity; comparison of linear
code properties over fields and over modules; study of weight functions
on finite modules, that generalize in some natural way the Hamming
weight on a finite field; the ways of representation of codes over fields by
linear codes over modules. The second one develops the general theory of
polylinear recurrences; describes the algebraic relations between the families
of linear recurrent sequences and their periodic properties; studies
the ways of obtaining "good" pseudorandom sequences from them. The
interaction of these two directions leads to the results on the representation
of linear codes by polylinear recurrences and to the constructions of
recursive MDS-codes. The common algebraic foundation for the effective
development of both directions is the Morita duality theory based on the
concept of a quasi-Frobenius module.



Introduction 

The theory of linear codes over finite fields is a well-established part of discrete
mathematics. The highlights of this theory include concepts of dimension, duality,
weight enumerators and their generalizations, MacWilliams identities, cyclic
codes, etc. [1,21,43,63,73]. The theory of linear recurrences (LR) over fields is
also well developed; it has rather more ancient historical roots and important
applications in various areas of mathematics (theory of algorithms, Monte Carlo
method, cryptography). In particular, the analytical formulae for the general
member of an LR were deduced, as well as the algebraic relations between the
families of LR, cyclic types of such families over finite fields, the distribution
laws for elements on the cycles of some LR and the ways of obtaining
pseudorandom sequences from them (see [79,41,33,36] and the sources cited there).
We point out also some papers on the properties of polylinear sequences over
Galois fields: [44,61,65,66,67,68].

The attempts to extend the mentioned results to some class of finite rings
more general than that of fields for a long time dealt mostly with integral residue
rings [2,3,4,6,71,72,74,75], and in recent years especially with the ring $\mathbb{Z}_4$ (see
e.g. [46,47,20,7,30,57,76]).

These last works finally established the relevance and importance of the investigations
of linear codes and polylinear recurrences over arbitrary finite rings and
modules that became so active in the 1970s to 1990s [14,22,23,24,26,27,28,32,33,36,39,45],
[48,49,52,53,54,55,58,59,60,70,77,78,62]. It turned out that a sufficiently deep
generalization requires not the absence of zero divisors in the base ring but the
so-called double annihilator relations in the module (see Section 1.1 below). This leads
to the concepts of quasi-Frobenius and reciprocal module, providing a nice way
to introduce for linear codes over an arbitrary finite module the main concepts
of coding theory over fields so that its fundamental results remain valid. For
instance, the parity check matrix and the dual code are defined over the reciprocal
module (having the same number of elements as the given one) and the
MacWilliams identities for complete weight enumerators remain true.

We apologize for the brevity of our further comments caused by the space 
limits which were rather severe for a paper of such kind. Due to the same cause, 
the reference list is not exhaustive. We only tried to cite all the specialists that 
worked in the area during the last several years. We apologize in advance to 
the authors whose names may be missed: please consider it not as a malicious 
intention but as the sign of our insufficient competence. More complete reference 
lists could be found in e.g. [33,36,45,60]. 

We consider the following topics which are comparatively new: 

1. General theory of linear codes over finite modules. 

2. The comparison of linear codes over finite modules, linear spaces and fields. 

3. Weight-functions on finite modules as a generalization of Hamming weight. 

4. Presentations of codes over fields by linear codes over modules. 

5. General theory of polylinear recurrences over finite modules. 

6. Presentations of linear codes over modules by polylinear recurrences. 

7. Recursive and linear recursive MDS-codes. 

8. Periodic properties of polylinear recurrences over finite modules. 

9. Pseudorandom sequences generated by polylinear recurrences. 

1 General Theory of Linear Codes over Modules 

Let $R$ be a finite commutative ring with identity $e$, and ${}_RM$ a finite faithful
module. Any submodule $\mathcal{K} \le {}_RM^n$ is called a linear $n$-code over ${}_RM$, its Hamming
distance defined as $d(\mathcal{K}) = \min\{\|\alpha\| : \alpha \in \mathcal{K} \setminus 0\}$, where $\|\alpha\|$ is the Hamming
weight of $\alpha$.

We choose as the main criterion of correctness for the theory of such codes
the presence of notions of the parity-check matrix and the code $\mathcal{K}^\circ$ dual to the
given code $\mathcal{K}$, defined in such a way that in particular the equality $\mathcal{K}^{\circ\circ} = \mathcal{K}$ and
the MacWilliams identity for complete weight enumerators of codes $\mathcal{K}$ and $\mathcal{K}^\circ$
should be valid.

Consider the following fundamental example. Suppose we study linear codes
$\mathcal{L} \le {}_RR^n$ over a ring $R$, and we define the dual code $\mathcal{L}^\circ$ in the usual way
as $\mathcal{L}^\circ = \{\beta \in R^n : \beta\mathcal{L} = 0\}$. Then $\mathcal{L}^{\circ\circ} \supseteq \mathcal{L}$, but the equality $\mathcal{L}^{\circ\circ} = \mathcal{L}$ is
guaranteed if and only if $R$ is a quasi-Frobenius ring [52,54] (for example principal
ideal rings and in particular $R = \mathbb{Z}_m$ are quasi-Frobenius). In such cases the
codes over $R$ can be studied without codes over modules, as was done in
[3,4,7,27,69,71,72]. On the other hand, if the ring $R$ is not quasi-Frobenius then
for deriving deep enough results the dual code should be built not over $R$ but
over the corresponding quasi-Frobenius module ${}_RQ$ (see below) that has the
same number of elements as $R$. In its turn, the code dual to a linear code over
${}_RQ$ is built over the ring $R$.

In the most general case, when studying linear codes over an arbitrary finite
module ${}_RM$, to obtain results close enough to those of the theory of linear
codes over fields, the code dual to the given one must be defined over the module
${}_RM^*$, which is Morita dual to ${}_RM$. Now we pass to the exact statements.

1.1. Quasi-Frobenius Modules and the Reciprocal Modules. For any
ideal $I \lhd R$ and any submodule $K < {}_RM$ we define their annihilators, in $M$ and in $R$
respectively, by the equalities

$$\mathrm{Ann}_M(I) = \{\alpha \in M : I\alpha = 0\} < {}_RM; \qquad \mathrm{Ann}_R(K) = \{r \in R : rK = 0\} \lhd R.$$

A module ${}_RM$ is called quasi-Frobenius (a QF-module) if $\mathrm{Ann}_R(\mathrm{Ann}_M(I)) = I$
and $\mathrm{Ann}_M(\mathrm{Ann}_R(K)) = K$ for all $I \lhd R$ and $K < {}_RM$. For any finite commutative
ring $R$ there exists a unique (up to isomorphism) QF-module ${}_RQ$ [18,52]. It
may be described as the character group $Q = \mathrm{Hom}(R, \mathbb{Q}/\mathbb{Z})$ of the group
$(R,+)$, where the product $r\omega \in Q$ of an element $\omega \in Q$ by an element $r \in R$ is
defined by the condition $r\omega(a) = \omega(ra)$ for any $a \in R$. We have $(Q,+) \cong (R,+)$
and $|Q| = |R|$. A ring $R$ is called quasi-Frobenius if ${}_RR$ is a QF-module.

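Now, as we continue to discuss the example given in the Introduction, define
the product of the row $a = (a_1, \dots, a_n) \in R^n$ by the row $\alpha = (\alpha_1, \dots, \alpha_n) \in Q^n$
as $a\alpha = a_1\alpha_1 + \dots + a_n\alpha_n \in Q$, and say that the code $\mathcal{L}^\circ = \{\alpha \in Q^n : \mathcal{L}\alpha = 0\}$
over ${}_RQ$ is dual to the linear code $\mathcal{L} < R^n$, while the code $\mathcal{K}^\circ = \{a \in R^n : a\mathcal{K} = 0\}$
over $R$ is dual to the linear code $\mathcal{K} < {}_RQ^n$. Then, in particular, we have
$\mathcal{L}^{\circ\circ} = \mathcal{L}$, $\mathcal{K}^{\circ\circ} = \mathcal{K}$, $|\mathcal{L}||\mathcal{L}^\circ| = |\mathcal{K}||\mathcal{K}^\circ| = |R|^n = |Q|^n$ [52,54]. This construction
is generalized to linear codes over an arbitrary finite module ${}_RM$ using the
following concept.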
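The equalities above can be checked by brute force in the simplest case $R = \mathbb{Z}_4$, which is a QF-ring, so that ${}_RQ$ may be identified with $R$. The following Python sketch is only an illustration: the helper names `span` and `dual` and the chosen generator rows are assumptions made for this example, not notation from the cited papers.

```python
from itertools import product

def span(generators, m):
    """Submodule of Z_m^n generated by the given rows."""
    n = len(generators[0])
    return {
        tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % m
              for i in range(n))
        for coeffs in product(range(m), repeat=len(generators))
    }

def dual(code, m, n):
    """All rows of Z_m^n whose product with every codeword is zero."""
    return {
        v for v in product(range(m), repeat=n)
        if all(sum(a * b for a, b in zip(v, c)) % m == 0 for c in code)
    }

m, n = 4, 3
code = span([(1, 1, 2), (0, 2, 2)], m)   # hypothetical example code over Z_4
code_dual = dual(code, m, n)
double_dual = dual(code_dual, m, n)

print(double_dual == code)                 # double dual equals the code
print(len(code) * len(code_dual) == m**n)  # |L| * |L°| = |R|^n
```

The exhaustive search over all of $\mathbb{Z}_m^n$ is only practical for very small parameters; it is meant to make the definitions concrete, not to be efficient.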

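We call the module ${}_RM^* = \mathrm{Hom}_R(M, Q)$ of all homomorphisms ${}_RM \to {}_RQ$
reciprocal to ${}_RM$ (or Morita-dual to ${}_RM$). It may also be presented as
$\mathrm{Hom}(M, \mathbb{Q}/\mathbb{Z})$ in the following way. Let $M = (a_1) \dotplus \dots \dotplus (a_t)$ be a direct sum
of cyclic groups. Then any $\varphi \in M^*$ generates $t$ characters $\varphi(a_i) = \omega_i \in Q = \mathrm{Hom}(R, \mathbb{Q}/\mathbb{Z})$,
$i \in \overline{1,t}$, and a unique character $\hat\varphi \in \mathrm{Hom}(M, \mathbb{Q}/\mathbb{Z})$ such that
$\hat\varphi(a_i) = \omega_i(e)$, $i \in \overline{1,t}$. The correspondence $\varphi \to \hat\varphi$ induces an isomorphism of $R$-modules
$M^* \to \mathrm{Hom}(M, \mathbb{Q}/\mathbb{Z})$. It is important to note that $(M^*,+) \cong (M,+)$,
so in particular $M$ and $M^*$ are equivalent alphabets.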

Let us define the product of $\alpha \in M$ by $\varphi \in M^*$ as $\varphi\alpha = \varphi(\alpha) \in Q$. Then for
a fixed $\alpha \in M$ the correspondence $\varphi \to \varphi\alpha$ induces a homomorphism ${}_RM^* \to Q$
belonging to $M^{**} = \mathrm{Hom}_R(M^*, Q)$. We identify this homomorphism with $\alpha$,
obtaining the equality $M^{**} = M$. Note that if ${}_RM = {}_RQ$ is a QF-module over $R$
then the $R$-module $M^* = Q^* = \mathrm{Hom}_R(Q,Q)$ is isomorphic to $R$ [18,52]. In
particular, if $R$ is a QF-ring and $M = R$, then there are natural identifications
$Q = R$ and $M^* = M = R$.

1.2. The Dual Code and the Parity-Check Matrix [55]. Let now $\mathcal{K} < {}_RM^n$
be a linear code over any finite module ${}_RM$. To define the dual code we
note that any element of the reciprocal module $(M^n)^* = \mathrm{Hom}_R(M^n,Q)$ may
be considered as the row $\varphi = (\varphi_1, \dots, \varphi_n) \in (M^*)^n = \mathrm{Hom}_R(M, Q)^n$, acting on
elements $\alpha = (\alpha_1, \dots, \alpha_n) \in M^n$ by the rule $\varphi(\alpha) = \varphi\alpha = \varphi_1\alpha_1 + \dots + \varphi_n\alpha_n \in Q$.

Then $(M^n)^* = (M^*)^n$. Now if $\mathcal{K} < {}_RM^n$, then the code $\mathcal{K}^\circ < {}_R(M^*)^n$ defined as
$\mathcal{K}^\circ = \mathrm{Ann}_{(M^*)^n}(\mathcal{K}) = \{\varphi \in (M^*)^n : \varphi\mathcal{K} = 0\}$ is called the dual code to
the code $\mathcal{K}$. Our conventions above give the inclusions $\mathcal{K}^{\circ\circ} < (M^n)^{**} = M^n$,
$\mathcal{K} \subseteq \mathcal{K}^{\circ\circ}$. If we consider $\mathcal{K}$ only as a subgroup of $(M^n, +)$, then $\mathcal{K}^\circ$ is the code dual
to $\mathcal{K}$ defined in [14]. But our construction additionally allows one to study $\mathcal{K}^\circ$
as an $R$-module if $\mathcal{K}$ is a submodule of ${}_RM^n$. Referring to [18,14] we have

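Proposition 1 There is a group isomorphism $\mathcal{K}^\circ \cong M^n/\mathcal{K}$, and $|\mathcal{K}^\circ| \cdot |\mathcal{K}| = |M|^n$, $\mathcal{K}^{\circ\circ} = \mathcal{K}$.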

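Let $\varphi_i = (\varphi_{i1}, \dots, \varphi_{in}) \in (M^*)^n$, $i \in \overline{1,l}$, be a generating system of the module
${}_R\mathcal{K}^\circ$. Let us call the matrix $\Phi = (\varphi_{ij})_{l \times n}$ over $M^*$ a parity-check matrix of the
code $\mathcal{K}$. It may be considered as a homomorphism $\Phi : {}_RM^n \to {}_RQ^{(l)}$ (here
${}_RQ^{(l)}$ is the module of all $l$-columns over $Q$) acting on $\alpha \in M^n$ by the rule
$\Phi(\alpha) = (\varphi_1\alpha, \dots, \varphi_l\alpha)^T$. Any column $\Phi_j = (\varphi_{1j}, \dots, \varphi_{lj})^T$ of the matrix $\Phi$ is
a homomorphism $\Phi_j : {}_RM \to {}_RQ^{(l)}$. Let us define the guaranteed rank $\kappa_M(\Phi)$
of the matrix $\Phi$ relative to $M$ as the maximal $k \in \mathbb{N}$ such that any system
of $k$ columns of $\Phi$ is linearly independent over $M$, i.e. $\Phi_{j_1}\alpha_1 + \dots + \Phi_{j_k}\alpha_k \ne 0$
for any $(\alpha_1, \dots, \alpha_k) \in M^k \setminus 0$. As for codes over fields we have

Proposition 2 $\mathcal{K} = \mathrm{Ker}\,\Phi$ and $d(\mathcal{K}) = \kappa_M(\Phi) + 1$.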

If a code $\mathcal{K} < {}_RM^n$ has a parity-check matrix over the ring $R$, i.e. a
matrix $\Phi_{l \times n}$ over $R$ such that $\mathcal{K} = \{\alpha \in M^n : \Phi\alpha^T = 0\}$, then it is called $R$-closed.
Proposition 2 is true for such matrices. Note that all linear codes over the QF-module
${}_RQ$ are $R$-closed, as are all linear codes over a QF-ring $R$ [50,51,52].

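1.3. The MacWilliams Identity. Let $M = \{\mu_1 = 0, \mu_2, \dots, \mu_m\}$ and $M^* = \{\mu^*_1 = 0, \mu^*_2, \dots, \mu^*_m\}$.
The complete weight enumerators of the codes $\mathcal{K} < {}_RM^n$
and $\mathcal{K}^\circ < {}_R{M^*}^n$ are polynomials of $m$ variables over $\mathbb{Z}$:

$$W_{\mathcal{K}}(x_\mu : \mu \in M) = \sum_{\alpha \in \mathcal{K}} x_{\alpha_1} \cdots x_{\alpha_n}, \qquad W_{\mathcal{K}^\circ}(y_{\mu^*} : \mu^* \in M^*) = \sum_{\varphi \in \mathcal{K}^\circ} y_{\varphi_1} \cdots y_{\varphi_n}. \qquad (1.1)$$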



According to [52] there exists a distinguishing character $\chi : (Q,+) \to (\mathbb{C}^*, \cdot)$ of
the module ${}_RQ$, i.e. a character such that $\chi(K) \ne 1$ for every nonzero submodule
$K < {}_RQ$. The following theorem may be considered as an extension of the results of
[14] for Hamming weight enumerators of dual codes over a finite abelian group
to complete weight enumerators of dual codes over modules.

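Theorem 3 ([55]) There is the MacWilliams identity

$$W_{\mathcal{K}^\circ}(y) = \frac{1}{|\mathcal{K}|} W_{\mathcal{K}}(T_1(y), \dots, T_m(y)), \quad \text{where } T_s(y) = \sum_{j=1}^{m} \chi(\mu^*_j\mu_s)\, y_{\mu^*_j}, \ s \in \overline{1,m}.$$

The (Hamming) weight enumerator of a code $\mathcal{K} < M^n$ satisfies the equalities

$$W^H_{\mathcal{K}}(x,y) = W_{\mathcal{K}}(x,y,\dots,y), \qquad W^H_{\mathcal{K}^\circ}(x,y) = \frac{1}{|\mathcal{K}|} W^H_{\mathcal{K}}(x + (m-1)y,\, x - y).$$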

These results generalize the preceding results of [50,51,52] (for the case when
${}_RM$ is a QF-module) and the results of [27,7] (for the case $M = R = \mathbb{Z}_m$). The
last theorem also implies the result of [77] about the MacWilliams identity for linear
codes over a finite (in general noncommutative) Frobenius ring, since the latter
can always be presented as a module over a suitable commutative QF-subring.
A slightly different approach to the concept of duality was proposed in [17].
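The Hamming-weight form of the MacWilliams identity can be verified numerically in the case $M = R = \mathbb{Z}_4$ mentioned above. The sketch below is an illustration under assumptions: the function names and the generator rows are hypothetical, and the enumerators are handled as coefficient lists in $y$ with $x = 1$, which loses nothing because the polynomials are homogeneous of degree $n$.

```python
from itertools import product

def span(generators, m):
    """Submodule of Z_m^n generated by the given rows."""
    n = len(generators[0])
    return {
        tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % m
              for i in range(n))
        for coeffs in product(range(m), repeat=len(generators))
    }

def dual(code, m, n):
    """All rows of Z_m^n orthogonal to every codeword."""
    return {
        v for v in product(range(m), repeat=n)
        if all(sum(a * b for a, b in zip(v, c)) % m == 0 for c in code)
    }

def weight_distribution(code, n):
    """dist[i] = number of codewords of Hamming weight i."""
    dist = [0] * (n + 1)
    for word in code:
        dist[sum(1 for a in word if a != 0)] += 1
    return dist

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_pow(a, e):
    out = [1]
    for _ in range(e):
        out = poly_mul(out, a)
    return out

def macwilliams(dist, m, size):
    """Coefficients of (1/size) * W(x + (m-1)y, x - y) at x = 1."""
    n = len(dist) - 1
    total = [0] * (n + 1)
    for i, count in enumerate(dist):
        term = poly_mul(poly_pow([1, m - 1], n - i), poly_pow([1, -1], i))
        for j, t in enumerate(term):
            total[j] += count * t
    if any(t % size for t in total):
        raise ValueError("transform is not divisible by the code size")
    return [t // size for t in total]

m, n = 4, 4
code = span([(1, 1, 1, 1), (0, 2, 0, 2)], m)   # hypothetical example code
lhs = weight_distribution(dual(code, m, n), n)
rhs = macwilliams(weight_distribution(code, n), m, len(code))
print(lhs == rhs)
```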

2 Comparison of Codes over Fields and Modules 

Here the study of linear codes over a module ${}_RM$ is reduced to the case when
the ring $R$ is local. In this case it is possible to build linear codes over ${}_RM$
which inherit the properties of linear codes over the residue field $\bar R$. But the
possibilities to build linear codes with better parameters than those of
linear codes over the field of $|M|$ elements are limited in some sense: every such
code is majored (cf. 2.2) by a linear code over the space $L$ of cardinality $|M|$.

2.1. Reduction to Local Rings. A finite commutative ring $R$ is called local
if its nilradical $\mathfrak{N} = \mathfrak{N}(R)$ (the set of all nilpotent elements) is a maximal ideal
of $R$. Then $\mathfrak{N}$ is the unique maximal ideal of $R$. Any ring we consider has a
decomposition into a direct sum of local subrings: $R = R_1 \dotplus \dots \dotplus R_t$. If $e = e_1 + \dots + e_t$,
where $e_s \in R_s$, then $e_s$ is the identity of $R_s$ and $R_s = e_sR$, $s \in \overline{1,t}$.
The module $M$ and the code $\mathcal{K} < {}_RM^n$ have corresponding decompositions:
$M = M_1 \dotplus \dots \dotplus M_t$, $\mathcal{K} = \mathcal{K}_1 \dotplus \dots \dotplus \mathcal{K}_t$, where $M_s = e_sM$ is an $R_s$-module and
$\mathcal{K}_s = e_s\mathcal{K}$ is a linear $n$-code over $M_s$.

Proposition 4 ([60,55]) $d(\mathcal{K}) = \min\{d(\mathcal{K}_1), \dots, d(\mathcal{K}_t)\}$.

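Let now $R$ be a local ring with the nilradical $\mathfrak{N}$. Then $\bar R = R/\mathfrak{N}$ is a field
of elements $\bar r = r + \mathfrak{N}$, $r \in R$. The socle $\mathfrak{S}(M)$ of the module ${}_RM$ (of the
code $\mathcal{K} < {}_RM^n$) is defined as the sum of all its irreducible submodules, and
$\mathfrak{S}(M) = \mathrm{Ann}_M(\mathfrak{N})$ ($\mathfrak{S}(\mathcal{K}) = \mathrm{Ann}_{\mathcal{K}}(\mathfrak{N})$) (see p. 1.1). We may consider $\mathfrak{S}(M)$
as a space over the field $\bar R$, where $\bar r\alpha = r\alpha$ for all $r \in R$ and $\alpha \in \mathfrak{S}(M)$.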

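Proposition 5 ([60,55]) Let $R$ be a local ring and $\mathcal{K} < {}_RM^n$. Then $\mathfrak{S}(\mathcal{K})$ is
a linear $n$-code over the space ${}_{\bar R}\mathfrak{S}(M)$ and $d(\mathcal{K}) = d(\mathfrak{S}(\mathcal{K}))$.

This allows one to “turn linear codes over fields into linear codes over modules”. We
say that $\mathcal{K} \subseteq M^n$ is an $[n, k, d]$-code over $M$ if $|\mathcal{K}| = |M|^k$, $d(\mathcal{K}) = d$.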

Proposition 6 ([55]) Let $R$ be a local ring. If there exists a linear $[n, k, d]$-code
$\mathcal{L}$ over the field $\bar R$, then there exists a linear systematic $R$-closed $[n,k,d]$-code
$\mathcal{K}$ over ${}_RM$. If $(n, |R|) = 1$ and $\mathcal{L}$ is a cyclic code, then $\mathcal{K}$ can be chosen to be
cyclic also.

This result generalizes some particular results of papers [3,4,69,71,72].

2.2. Linear Codes over Fields, Spaces and Modules. Let $L$ be an elementary
abelian $p$-group of order $q = p^t$, i.e. a finite linear space over $GF(p)$.
If $t > 1$ then there exist linear codes over $L$ which are better than linear codes
over $GF(q)$. Let $B_L(n, 3)$ ($B_q(n, 3)$) be the maximum of the cardinalities of linear
$n$-codes over $(L,+)$ (over $GF(q)$) with distance 3.

Proposition 7 ([60]) If < n < — {p^ — 1) for some k > 2 and 

6 G l,t — 1, then BL(n, 3) = p*~^Bq(ji, 3). 

This is a generalization of an earlier result of [25] for some cases when $p = 2$. The
attempts to build linear codes over modules which are better than linear codes
over linear spaces unexpectedly failed. We say that an $n$-code $\mathcal{K}$ over $M$ is majored
by an $n$-code $\mathcal{L}$ over some alphabet $L$ if $|L| = |M|$, $|\mathcal{L}| \ge |\mathcal{K}|$ and $d(\mathcal{L}) \ge d(\mathcal{K})$.

Theorem 8 ([55]) Let ${}_RM$ be a module over a local ring $R$, and let ${}_{\bar R}L$ be a
linear space of cardinality $|L| = |M|$ over the residue field $\bar R = R/\mathfrak{N}$. Then any
linear code $\mathcal{K} < {}_RM^n$ is majored by some linear code $\mathcal{L} < {}_{\bar R}L^n$. If $M$ is a finite
abelian group and $L$ is a direct sum of elementary abelian groups of cardinality
$|L| = |M|$, then any linear $n$-code over $M$ is majored by some linear $n$-code over
$L$.

See also p. 7.2 below about the Asturian code. 

3 Weight Functions on Finite Modules 

The investigation of linear codes over modules is important not so much for the
construction of codes which are better than codes over fields as for the description of new
linear representations of these codes. In [46,47] it was discovered that certain
nonlinear binary codes (Kerdock codes) can be represented as linear codes over $\mathbb{Z}_4$.
Later in [20] a variant of this representation (the so-called Gray map) was found
which gives an isometry between the Lee-metric space $\mathbb{Z}_4^n$ and the
Hamming-metric space $\mathbb{F}_2^{2n}$. Meanwhile in [37,38,56] a generalized Kerdock code over
$\mathbb{F}_q$, $q = 2^l$, was built as a representation of some linear code over the Galois
ring $GR(q^2,4)$. Independently, in [8,9] the so-called homogeneous weight on the
residue class ring $\mathbb{Z}_m$ was introduced, and the resulting metric was characterized
by algebraic-information-theoretic properties. With a suitable generalization of
this weight to the ring $GR(q^2,p^2)$ the Reed-Solomon map [56] isometrically
embeds $GR(q^2,p^2)^n$ into the space $(\mathbb{F}_q^{qn}, w_{\mathrm{Ham}})$. So the notion of homogeneous
weight is closely related to the representations of codes over fields by linear
codes over modules.
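The isometry property of the Gray map mentioned above ($0 \mapsto 00$, $1 \mapsto 01$, $2 \mapsto 11$, $3 \mapsto 10$, as spelled out in Section 4) can be confirmed by exhaustive comparison of Lee and Hamming distances. The following Python sketch is illustrative; the function names are assumptions made for this example.

```python
from itertools import product

GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}

def lee_weight(a):
    """Lee weight of a residue modulo 4."""
    a %= 4
    return min(a, 4 - a)

def lee_distance(x, y):
    return sum(lee_weight(a - b) for a, b in zip(x, y))

def gray(x):
    """Coordinate-wise Gray image of a vector over Z_4 in F_2^(2n)."""
    return tuple(bit for a in x for bit in GRAY[a % 4])

def hamming_distance(u, v):
    return sum(1 for a, b in zip(u, v) if a != b)

n = 2
vectors = list(product(range(4), repeat=n))
is_isometry = all(
    lee_distance(x, y) == hamming_distance(gray(x), gray(y))
    for x in vectors for y in vectors
)
print(is_isometry)
```

Note that the map is not additive (the image of $1 + 1 = 2$ is $11$, while $01 + 01 = 00$), which is why the Gray image of a linear code over $\mathbb{Z}_4$ may be a nonlinear binary code.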

Here we present the results of [22,24], where the general notion of homogeneous
weight on finite modules $M$ over arbitrary finite (possibly noncommutative) rings
$R$ was introduced and investigated.

A function $w : M \to \mathbb{R}$ is called a weight if

(W1) $\forall x \in M$: $w(x) \ge 0$, $w(x) = 0 \iff x = 0$;

(W2) $\forall x \in M$: $w(x) = w(-x)$;

(W3) $\forall x, y \in M$: $w(x + y) \le w(x) + w(y)$.

For any weight $w : M \to \mathbb{R}$ the function $\rho_w(x, y) = w(x - y)$ defines a
translation-invariant metric on $M$, and every translation-invariant metric $\rho$ on $M$ arises in
this way from the weight $w_\rho(x) = \rho(x, 0)$.

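We call a function $w : M \to \mathbb{R}$ egalitarian if

(H1) there exists $\zeta \in \mathbb{R}$ such that $\sum_{x \in U} w(x) = \zeta \cdot |U|$ for any nonzero
submodule $U \le M$.

This function is called homogeneous if in addition

(H2) $\forall x \in M$, $\forall u \in R^*$: $w(x) = w(ux)$.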

A module ${}_RM$ is called weighted if it admits an egalitarian weight $w$. In
this case it also admits a homogeneous weight: $w^*(x) = \frac{1}{|R^*|} \sum_{u \in R^*} w(ux)$.

Note that the Hamming weight $w = w_{\mathrm{Ham}}$ on ${}_RM$ is homogeneous if and only if
the module ${}_RM$ is simple. In [9] the following motivation for introducing the
homogeneity axiom (H1) was given. For an arbitrary weight $w$ on ${}_RM$ and
$n \in \mathbb{N}$ the weight $w^n : M^n \to \mathbb{R}$ defined by $w^n(x) = w(x_1) + \dots + w(x_n)$ for
$x = (x_1, \dots, x_n) \in M^n$ turns $M^n$ into a translation-invariant metric space. Let
now $\mathcal{K}$ be a linear code over ${}_RM$, i.e. a submodule of ${}_RM^n$. Then the projection
$\mathcal{K}_i$ of $\mathcal{K}$ onto the $i$-th coordinate is a submodule of ${}_RM$. For information-theoretic
purposes it is natural to require that $\mathcal{K}_i \ne 0$ for every $i$ (i.e., $\mathcal{K}$ is a full-length
code) and that the numbers $W_i = \sum_{x \in \mathcal{K}} w(x_i)$ satisfy the condition
$W_1 = W_2 = \dots = W_n$. The second condition holds for full-length linear codes over fields. It is
satisfied by every full-length linear code over ${}_RM$ if and only if $w$ satisfies (H1).

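We give the full description of weighted modules. Let $\mathfrak{N}(R)$ be the nilradical
of the ring $R$. The quotient ring $\bar R = R/\mathfrak{N}(R)$ has an Artin-Wedderburn-Molien
decomposition

$$\bar R = \bar R_1 \oplus \dots \oplus \bar R_t, \qquad (3.1)$$

where each ideal $\bar R_j$ is a simple subring of $\bar R$. Hence there exist positive integers
$m_j$ and prime powers $q_j$ ($j \in \overline{1,t}$) such that $\bar R_j$ is isomorphic to the matrix ring
$M_{m_j}(\mathbb{F}_{q_j})$. The socle $\mathfrak{S}({}_RM)$ of ${}_RM$ (p. 2.1) is an $\bar R$-module, and

$$\mathfrak{S}(M) = S_1 \oplus S_2 \oplus \dots \oplus S_t, \qquad S_j = \bar R_jM. \qquad (3.2)$$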



Theorem 9 A module ${}_RM$ is weighted if and only if $\mathfrak{S}(M)$ is a cyclic $R$-module
(i.e. $n_j \le m_j$ for $j \in \overline{1,t}$) and the decomposition (3.2) does not contain $\mathbb{F}_2 \oplus \mathbb{F}_2$
or $\mathbb{F}_2 \oplus \mathbb{F}_3$ as a direct summand.



Corollary 10 (a) A finite abelian group of order $m$ is weighted if and only if it
is cyclic and $m \not\equiv 0 \pmod 6$.

(b) A faithful module ${}_RM$ over a finite commutative local ring is weighted if
and only if it is a QF-module.

Corollary 10(a) implies that the Constantinescu-Heise criterion [9] is true
not only for cyclic groups but for all finite abelian groups.

A finite ring $R$ is called a Frobenius ring if ${}_R\bar R \cong \mathfrak{S}({}_RR)$ and $\bar R_R \cong \mathfrak{S}(R_R)$.







Theorem 11 For a finite ring $R$ both modules ${}_RR$ and $R_R$ are weighted if and
only if $R$ is a Frobenius ring and the decomposition (3.1) does not contain $\mathbb{F}_2 \oplus \mathbb{F}_2$
or $\mathbb{F}_2 \oplus \mathbb{F}_3$. In this case the left and right homogeneous weights on $R$ coincide.

We denote by $\mathcal{T}_R$ the class of all finite $R$-modules and define the Euler
function $\tau : \mathcal{T}_R \to \mathbb{N}$ as $\tau(M) = |\{x \in M \mid M = Rx\}|$ and the Möbius function
$\mu_R : \mathcal{T}_R \to \mathbb{Z}$ by the recursion formulae: $\mu_R(0) = 1$, $\sum_{U \le M} \mu_R(U) = 0$ if $M \in \mathcal{T}_R \setminus 0$.

Theorem 12 For a weighted module ${}_RM$ there exists a unique homogeneous
weight $w = w_h(x)$ such that $\sum_{x \in U} w(x) = |U|$ for any nonzero submodule
$U \le M$. This weight has the form

$$w_h(x) = 1 - \frac{\mu_R(Rx)}{\tau(Rx)} \quad \text{for all } x \in M. \qquad (3.3)$$
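Formula (3.3) can be made concrete for $M = R = \mathbb{Z}_m$. For a cyclic submodule $Rx$ of order $o$, the submodule lattice of $Rx$ is the divisor lattice of $o$, so the Möbius function of the module reduces to the classical number-theoretic Möbius function of $o$, and the number of generators of $Rx$ is Euler's totient of $o$. The sketch below relies on this reduction; the function names are assumptions made for this example. It also illustrates Corollary 10(a): for $m = 6$ the function still satisfies the average condition (H1), but the triangle inequality fails, so it is not a weight.

```python
from fractions import Fraction
from math import gcd

def mobius(n):
    """Classical Moebius function of a positive integer."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def euler_phi(n):
    """Number of generators of a cyclic group of order n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def homogeneous_weight(m):
    """Values of formula (3.3) at x = 0, ..., m-1 for M = R = Z_m."""
    values = []
    for x in range(m):
        order = m // gcd(x, m)    # order of the cyclic submodule Rx
        values.append(1 - Fraction(mobius(order), euler_phi(order)))
    return values

def is_egalitarian(w, m):
    """Check sum_{x in U} w(x) = |U| for every nonzero subgroup U of Z_m."""
    for d in range(2, m + 1):
        if m % d == 0:
            subgroup = range(0, m, m // d)
            if sum(w[x] for x in subgroup) != d:
                return False
    return True

def is_weight(w, m):
    """Check axioms (W1)-(W3) on Z_m."""
    positive = w[0] == 0 and all(w[x] > 0 for x in range(1, m))
    symmetric = all(w[x] == w[-x % m] for x in range(m))
    triangle = all(w[(x + y) % m] <= w[x] + w[y]
                   for x in range(m) for y in range(m))
    return positive and symmetric and triangle

w4 = homogeneous_weight(4)   # coincides with the Lee weight on Z_4
w6 = homogeneous_weight(6)
print(w4, is_weight(w4, 4), is_weight(w6, 6))
```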

4 Scaled Isometries and Presentation of Codes 

We describe here a fairly general technique, based on the concept of scaled
isometry, for constructing presentations of linear codes over weighted
modules, and give some examples of efficient applications of this technique. For
a weighted module ${}_RM \in \mathcal{T}_R$ we fix some egalitarian weight $w_R$. It is extended
to $w_R^n : M^n \to \mathbb{R}$ by setting $w_R^n(x) = \sum_{i=1}^{n} w_R(x_i)$, and generates a metric
$\rho_R^n(x,y) = w_R^n(x - y)$ on $M^n$.

Let now ${}_SN$ be another weighted module over some ring $S$ with an egalitarian
weight $w_S$. Suppose that for some $d \in \mathbb{N}$ and $\zeta \in \mathbb{R} \setminus 0$ there exists a mapping
$\sigma : M \to N^d$ satisfying

$$\forall a, b \in M: \quad \rho_S^d(\sigma(a), \sigma(b)) = \zeta \cdot \rho_R(a, b). \qquad (4.1)$$

We call $\sigma$ an isometry with scale factor $\zeta$ or, for short, a scaled isometry. It
induces for every $n \in \mathbb{N}$ a scaled isometry $\sigma^n : (M^n, \rho_R^n) \to (N^{dn}, \rho_S^{dn})$ with the
same scale factor. With every code $C \subseteq M^n$ we associate the code $C' = \sigma^n(C) \subseteq N^{dn}$
and call $C'$ a $\sigma$-representation of the code $C$. Note that if $C$ is distance
invariant (relative to the metric $\rho_R^n$) then so is $C'$ (relative to the metric $\rho_S^{dn}$).
If $C$ is a linear code over ${}_RM$, we call $C'$ a $\sigma$-linear code (and sometimes briefly
an ${}_RM$-linear code). An ${}_RM$-linear code $C'$ is distance invariant but may be
nonlinear. This approach allows one to construct some new good codes and to find
new compact representations of some well-known codes.

In [20] an isometry between $(\mathbb{Z}_4, \rho_{\mathbb{Z}_4})$ and $(\mathbb{F}_2^2, \rho_{\mathrm{Ham}})$ was rediscovered (the
so-called Gray mapping $\mathbb{Z}_4 \to \mathbb{F}_2^2$, $0 \mapsto 00$, $1 \mapsto 01$, $2 \mapsto 11$, $3 \mapsto 10$),
and the term $\mathbb{Z}_4$-linear code was introduced for what we call a $\sigma$-linear code.
This approach allows one to repeat the proof of the $\mathbb{Z}_4$-linearity of the binary Kerdock code
[46,47] and to prove the $\mathbb{Z}_4$-linearity of Preparata, Delsarte-Goethals and some
other codes. The more general form of this mapping for Galois rings in [56,34]
gives

4.1. A Generalized Kerdock Code over $\mathbb{F}_q$, $q = 2^l$. Let $R = GR(q^2,4)$ be
the Galois ring of characteristic 4 and cardinality $q^2$, $q = 2^l$, $l \ge 1$. A generalized
Kerdock code $K_q(m + 1)$ over $\mathbb{F}_q$ ($m$ is odd) is a Reed-Solomon presentation of
the so-called base linear code over the ring $R$.

Let $S = GR(q^{2m},4)$ be an extension of degree $m$ of the ring $R$. The set
$\Gamma(S) = \{\beta \in S : \beta^{q^m} = \beta\}$ is closed under multiplication and consists of $q^m$
elements. Any element $\beta \in S$ can be written uniquely as the sum $\beta = \beta_0 + 2\beta_1$,
where $\beta_t = \gamma_t(\beta) \in \Gamma(S)$, $t = 0, 1$. If we define $\oplus$ on $\Gamma(S)$ by the rule
$u \oplus v = \gamma_0(u + v)$, then $(\Gamma(S), \oplus, \cdot)$ is $\mathbb{F}_{q^m}$ and $\Gamma(R) = \{\beta \in R : \beta^q = \beta\}$ is the subfield
$\mathbb{F}_q$ of $\Gamma(S)$. Let $\theta$ be a primitive element of the field $\Gamma(S)$. The base code $\mathcal{K}_R(m)$
is defined as a linear code of length $h = q^m$ over $R$ consisting of all words
$v = (v(0), \dots, v(h-1))$ such that for some $\xi \in S$, $c \in R$

$$v(i) = \mathrm{Tr}^S_R(\xi\theta^i) + c, \quad i \in \overline{0,h-2}, \qquad v(h-1) = c, \qquad (4.2)$$

where $\mathrm{Tr}^S_R(x)$ is the trace function from $S$ onto $R$: $\mathrm{Tr}^S_R(x) = \sum_\sigma \sigma(x)$, where $\sigma$ runs over
the group of automorphisms of $S$ over $R$.

Let now $\Gamma(R) = \{\omega_0 = 0, \omega_1 = e, \dots, \omega_{q-1}\}$. Define $\gamma_* : R \to \Gamma(R)^q$ on
elements $r = r_0 + 2r_1 \in R$ as $\gamma_*(r) = (r_1, r_1 \oplus \omega_1r_0, \dots, r_1 \oplus \omega_{q-1}r_0)$. Then $\gamma_*(R)$
is a Reed-Solomon $[q,2,q-1]_q$-code over $\mathbb{F}_q$, and therefore the mapping $\gamma_*$
is called the RS-mapping [56]. It is easy to see that $\gamma_*$ is a scaled isometry
of the space $R$ with the homogeneous weight into the Hamming space $\Gamma(R)^q$.
The code $K_q(m + 1)$ is a concatenation of the code $\mathcal{K}_R(m)$ (linear over $R$) and
the code $\gamma_*(R)$ (linear over $\Gamma(R)$). It is the code of length $n = q^{m+1}$ consisting
of all words $\gamma_*(v) = (\gamma_*(v(0)), \dots, \gamma_*(v(h-1)))$, $v \in \mathcal{K}_R(m)$. If $q = 2$, i.e.
$R = \mathbb{Z}_4$, this code is the original binary Kerdock code [1-5,7,15].


Theorem 13 ([34]) Let $m = 2\lambda + 1 \ge 3$. Then the code $K_q(m + 1)$ is
an $R$-linear $(n, n^2, \frac{q-1}{q}(n - \sqrt{n}))_q$-code with the complete weight enumerator
$W_{K_q(m+1)}(x_0, \dots, x_{q-1})$ [explicit expression garbled in the source].



4.2. Presentations of the Extended Binary Golay Code [24,22]. It can
be presented as a linear code over the ring $R = \mathbb{F}_2 \oplus \mathbb{F}_4$. Note that smaller rings
of this form ($\mathbb{F}_2 \oplus \mathbb{F}_2$, $\mathbb{F}_2 \oplus \mathbb{F}_3$) are not weighted.



Proposition 14 Let $e = e_1 + e_2$ be the corresponding decomposition of the
identity of $R$, and $\mathbb{F}_4 = \mathbb{F}_2(\alpha)$. Let $\sigma : R \to \mathbb{F}_2^3$ be the $\mathbb{F}_2$-linear map defined
by $e_1 \mapsto 111$, $e_2 \mapsto 110$, $\alpha \mapsto 011$ and let $\mathcal{K} < {}_RR^8$ be the $R$-linear code with
parity-check matrix

$$\begin{pmatrix} e_1 & e_2 & e & e & e & 0 & 0 & 0 \\ e & e_1 & e_2 & e & 0 & e & 0 & 0 \\ e & e & e_1 & e_2 & 0 & 0 & e & 0 \\ e_2 & e & e & e_1 & 0 & 0 & 0 & e \end{pmatrix} \qquad (4.3)$$

Then the mapping $\sigma$ is a scaled isometry from $(R, \rho_R)$ onto $(\mathbb{F}_2^3, \rho_{\mathrm{Ham}})$ with
scale factor $3/2$, and the code $\sigma^8(\mathcal{K})$ is a linear binary (Golay) $[24, 12, 8]$-code.

Another isometric representation of the same code is based on some egalitarian,
but not homogeneous, weight. Let $R = \mathbb{F}_2[x]/(x^4) = \mathbb{F}_2[z]$, where $z = x + (x^4)$
is the image of $x$ in $\mathbb{F}_2[x]/(x^4)$. Every $a \in R$ has the unique representation
$a = a_0 + a_1z + a_2z^2 + a_3z^3$ with $a_j \in \mathbb{F}_2$. Define $\tau : R \to \mathbb{F}_2^4$ and $w : R \to \mathbb{R}$ by
setting $\tau(a) = (a_0 + a_1 + a_2 + a_3,\ a_1 + a_3,\ a_2 + a_3,\ a_3)$ and $w(a) = w_{\mathrm{Ham}}(\tau(a))$.
The function $w$ is an egalitarian weight on ${}_RR$.
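The claim that this weight is egalitarian but not homogeneous can be checked directly, since the ideals of $\mathbb{F}_2[z]/(z^4)$ form the chain $(1) \supset (z) \supset (z^2) \supset (z^3)$. The sketch below assumes the reading $\tau(a) = (a_0 + a_1 + a_2 + a_3,\ a_1 + a_3,\ a_2 + a_3,\ a_3)$ of the map defined above; the representation of ring elements as coefficient tuples and all function names are choices made for this illustration.

```python
from itertools import product

N = 4  # R = F_2[z]/(z^4); an element is its coefficient tuple (a0, a1, a2, a3)

def mul(a, b):
    """Product in F_2[z]/(z^4): multiply and drop the terms z^4 and higher."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                out[i + j] ^= ai & bj
    return tuple(out)

def tau(a):
    a0, a1, a2, a3 = a
    return ((a0 + a1 + a2 + a3) % 2, (a1 + a3) % 2, (a2 + a3) % 2, a3)

def w(a):
    """Hamming weight of tau(a)."""
    return sum(tau(a))

ring = list(product((0, 1), repeat=N))
units = [a for a in ring if a[0] == 1]   # elements with nonzero constant term

def ideal(i):
    """The ideal (z^i): elements whose coefficients a_0..a_{i-1} vanish."""
    return [a for a in ring if all(c == 0 for c in a[:i])]

# Average weight over each nonzero ideal (z^0), ..., (z^3).
averages = [sum(w(a) for a in ideal(i)) / len(ideal(i)) for i in range(N)]
# Homogeneity (H2) would require w(ua) = w(a) for every unit u.
homogeneous = all(w(mul(u, a)) == w(a) for u in units for a in ring)
print(averages, homogeneous)
```

All four averages coincide, which is condition (H1), while for instance $w(1) = 1$ and $w(1 + z^3) = 3$ differ although $1 + z^3$ is a unit multiple of $1$.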

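Proposition 15 Let $\mathcal{K} < {}_RR^6$ be the linear code with parity-check matrix

$$\begin{pmatrix} 1 & 0 & 0 & v & z & z \\ 0 & 1 & 0 & z & v & z \\ 0 & 0 & 1 & z & z & v \end{pmatrix} \qquad (4.4)$$

where $v = 1 + z^3$. The code $\tau^6(\mathcal{K})$ is the linear (Golay) $[24, 12, 8]$-code over $\mathbb{F}_2$.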

4.3. Presentation of Ternary Golay Code [22]. Let $R = \mathbb{F}_3[x]/(x^3) = \mathbb{F}_3[z]$
with $z = x + (x^3)$. Then any $a \in R$ has the unique representation
$a = a_0 + a_1z + a_2z^2$ with $a_j \in \mathbb{F}_3$. Let now $\sigma : R \to \mathbb{F}_3^3$ be defined by
$\sigma(a) = (a_0 - a_1 + a_2,\ a_1 + a_2,\ a_2)$. Then $w : R \to \mathbb{R}$, defined by $w(a) = w_{\mathrm{Ham}}(\sigma(a))$, is
an egalitarian weight on ${}_RR$.



Proposition 16 Let $\mathcal{K} < {}_RR^4$ be the linear code with parity-check matrix



1 0 V 
0 1 —V 



(4.5) 



where $v = 1 + z^2$. The code $\sigma^4(\mathcal{K})$ is a linear (Golay) $[12, 6, 6]$-code over $\mathbb{F}_3$.

4.4. Scaled Isometry over a Commutative QF-Ring [24,22]. Let now
$R$ be a finite commutative ring with the unique minimal ideal $S$, i.e. $R$ is a
local quasi-Frobenius ring with $\mathrm{soc}\, R = S$. We construct a scaled isometry from
the weighted $R$-module ${}_RR$ into a suitable Hamming space $\mathbb{F}_q^n$. Let $J = \mathrm{rad}\, R$
and $\bar R = R/J = \mathbb{F}_q$. The set $\Gamma(R) = \{a \in R \mid a^q = a\}$ is a full system of
representatives for $\bar R$, and thus has a natural field structure $(\Gamma(R), \oplus, \cdot)$. There
exists a system of elements $\pi_0, \dots, \pi_l \in R$ such that $\pi_l$ is a generator of $S$ and
every $x \in R$ has the unique representation

$$x = a_0\pi_0 + \dots + a_l\pi_l \quad \text{with } a_i \in \Gamma(R) \text{ for } i \in \overline{0,l}. \qquad (4.6)$$

We fix such a system $(\pi_0, \dots, \pi_l)$ and define the functions $\gamma_i : R \to \Gamma(R)$ by setting
$x = \sum_{i=0}^{l} \gamma_i(x)\pi_i$. Consider the $((l + 1) \times q^l)$-matrix $G(l, q)$ with entries from $\Gamma(R)$
whose columns are the vectors $(a_0, \dots, a_{l-1}, 1)^T$, $(a_0, \dots, a_{l-1}) \in \Gamma(R)^l$, in some
fixed order. The $q$-ary linear $[q^l, l + 1]$-code with generator matrix $G(l, q)$ is the
generalized Reed-Muller code $C = GRM(1, l, q)$, cf. [21, Ch. 9.5]. It is a
two-weight code with nonzero weights $q^l - q^{l-1}$ and $q^l$.

Proposition 17 The mapping $\sigma : R \to \Gamma(R)^{q^l}$ defined by

$$\sigma(x) = (\gamma_0(x), \gamma_1(x), \dots, \gamma_l(x)) \cdot G(l, q), \qquad (4.7)$$

is a scaled isometry with scale factor $q^l - q^{l-1}$ from $(R, \rho_h)$ onto a generalized
Reed-Muller $[q^l, l + 1, q^l - q^{l-1}]$-code $GRM(1, l, q)$ over the field $(\Gamma(R), \oplus, \cdot)$.

Some particular cases of this result can be found in [33,56,9]. For a
generalization to arbitrary finite commutative local rings (using the notion of a
quasi-Frobenius module) we refer to [24]. About linearly representable codes
over chain rings see also [23].

We should also mention here the works [70,78] concerning the extendibility
of code isometries.



5 General Theory of Polylinear Recurring Sequences 

As was already pointed out in the Introduction, the theory of (poly)linear
recurrences over modules has been actively developing recently due to various
applications, in particular in coding theory. It is worth noting that, as in previous
sections, quasi-Frobenius modules play a special role. Here we use the results of
[33,36] and of works cited therein, in particular [50,54,61,65,66,67,68,74,75,79].

5.1. Main Concepts. For $k \in \mathbb{N}$, we call any function $\mu : \mathbb{N}_0^k \to M$ a $k$-sequence
over a module ${}_RM$. We write $\mu = \mu(z)$, where $z = (z_1, \dots, z_k)$ is the row of
free variables over $\mathbb{N}_0$. The set $M^{\langle k \rangle}$ of all $k$-sequences over $M$ is an $R$-module
relative to the usual operations on functions. Let $\mathcal{P}_k = R[x_1, \dots, x_k] = R[x]$
be the polynomial ring of $k$ variables. For any $s = (s_1, \dots, s_k) \in \mathbb{N}_0^k$ denote
by $x^s$ the monomial $x_1^{s_1} \cdots x_k^{s_k}$. Then any $F(x) \in \mathcal{P}_k$ has the form $F(x) = \sum_s f_sx^s$.
We define the structure of a $\mathcal{P}_k$-module on $M^{\langle k \rangle}$ by $F(x)\mu = \nu$, where $\nu \in M^{\langle k \rangle}$ is given by
$\nu(z) = \sum_s f_s\mu(z + s)$.

An ideal $I$ of $\mathcal{P}_k$ is called monic if there exist monic polynomials
$F_1(x), \dots, F_k(x) \in R[x]$ (of one variable) such that

$$F_1(x_1), \dots, F_k(x_k) \in I. \qquad (5.1)$$

If $I$ is a monic ideal, then the quotient ring $\mathcal{P}_k/I$ is finite, and vice versa. The
annihilator $\mathrm{Ann}(\mathcal{M}) = \mathrm{Ann}_{\mathcal{P}_k}(\mathcal{M}) = \{F(x) \in \mathcal{P}_k : F(x)\mathcal{M} = 0\}$ of any subset
$\mathcal{M} \subseteq M^{\langle k \rangle}$ in the ring $\mathcal{P}_k$ is an ideal of $\mathcal{P}_k$. We say that a sequence $\mu \in M^{\langle k \rangle}$
is a $k$-linear recurring sequence ($k$-LRS) over a module $M$ if $I = \mathrm{Ann}(\mu)$ is a
monic ideal. In this case, polynomials (5.1) are called elementary characteristic
polynomials of the $k$-LRS $\mu$.
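The action of a polynomial on a $k$-sequence, $\nu(z) = \sum_s f_s\mu(z + s)$, and the notion of elementary characteristic polynomials can be illustrated for $k = 2$ over $\mathbb{Z}_4$. In the sketch below the sequence `mu`, the dictionary representation of polynomials (exponent tuple to coefficient) and the function name `act` are all assumptions made for this example; checking the annihilation on a finite window is of course an illustration, not a proof.

```python
def act(poly, mu, m):
    """Return the k-sequence F(x)mu: z -> sum_s f_s * mu(z + s) (mod m)."""
    def nu(z):
        return sum(f * mu(tuple(zi + si for zi, si in zip(z, s)))
                   for s, f in poly.items()) % m
    return nu

m = 4

def mu(z):
    """Hypothetical 2-sequence over Z_4: mu(z1, z2) = 3^z1 * (z2 + 1) mod 4."""
    return pow(3, z[0], m) * (z[1] + 1) % m

F1 = {(1, 0): 1, (0, 0): -3}               # x1 - 3, monic in x1
F2 = {(0, 2): 1, (0, 1): -2, (0, 0): 1}    # (x2 - 1)^2, monic in x2
G = {(1, 0): 1, (0, 0): -1}                # x1 - 1, does not annihilate mu

window = [(a, b) for a in range(6) for b in range(6)]
annihilated = all(act(F1, mu, m)(z) == 0 and act(F2, mu, m)(z) == 0
                  for z in window)
print(annihilated, act(G, mu, m)((0, 0)))
```

Since `F1` and `F2` are monic polynomials in $x_1$ and $x_2$ respectively, they play the role of elementary characteristic polynomials (5.1) for this 2-LRS.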

Proposition 18 The set $\mathcal{L}M^{\langle k \rangle}$ of all $k$-LRS $\mu \in M^{\langle k \rangle}$ is a submodule of the
$\mathcal{P}_k$-module $M^{\langle k \rangle}$. For any subset $I \subseteq \mathcal{P}_k$ the set $L_M(I) = \{\mu \in M^{\langle k \rangle} : I\mu = 0\}$
is also a submodule of this module. Moreover, $L_M(I) \subseteq \mathcal{L}M^{\langle k \rangle}$ if and only if
$I$ is a monic ideal of $\mathcal{P}_k$.

For a monic ideal $I \lhd \mathcal{P}_k$ the set $L_M(I)$ is called a $k$-LRS-family over ${}_RM$.
Since $I \cdot L_M(I) = 0$, the $\mathcal{P}_k$-module $L_M(I)$ may be considered as a module over
the ring $S = \mathcal{P}_k/I = R[\theta_1, \dots, \theta_k]$, where $\theta_s = x_s + I$. This ring will be called
the operator ring of the ideal $I$ (of the family $L_M(I)$). If $I = \mathrm{Ann}(\mu)$ for some
$\mu \in M^{\langle k \rangle}$, then $S$ is said to be the operator ring of the $k$-sequence $\mu$.

5.2. Relations between LRS-Families and their Annihilators. The following
relations between 1-LRS families over a field $P$ are well-known. For any
monic polynomials $F(x), G(x) \in \mathcal{P} = P[x]$

$$L_P(F) + L_P(G) = L_P([F, G]); \qquad L_P(F) \cap L_P(G) = L_P((F, G)).$$

Any 1-LRS family $\mathcal{M}$ over the field $P$ has the form $\mathcal{M} = L_P(F)$ for some monic
polynomial $F(x) \in \mathcal{P}$ and is a cyclic $\mathcal{P}$-module: $\mathcal{M} = \mathcal{P}e^F$. For any $u, v \in \mathcal{L}P^{\langle 1 \rangle}$

$$v \in \mathcal{P}u \iff M_v(x) \mid M_u(x),$$

where $M_u(x)$ is the minimal (characteristic) polynomial of the LRS $u$. Any monic
ideal $I \lhd \mathcal{P}$ is the annihilator of some LRS over $P$. The attempts to generalize
these results to $k$-LRS families over a finite module ${}_RM$ give the following results.

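Let $\mathfrak{A}_k = \mathfrak{A}_k(R)$ be the set of all monic ideals of the ring $\mathcal{P}_k = R[x_1, \dots, x_k]$
and let $\mathfrak{M}_k = \mathfrak{M}_k(M)$ be the set of all finite $\mathcal{P}_k$-submodules of the module $M^{\langle k \rangle}$.
In this case any element $\mathcal{M} \in \mathfrak{M}_k$ is a submodule of $\mathcal{L}M^{\langle k \rangle}$ and we have a pair
of Galois correspondences

$$\mathrm{Ann} : \mathfrak{M}_k \to \mathfrak{A}_k, \qquad L_M : \mathfrak{A}_k \to \mathfrak{M}_k. \qquad (5.2)$$

This means that for any $\mathcal{M} \in \mathfrak{M}_k$, $I \in \mathfrak{A}_k$ we get

$$\mathcal{M} \subseteq L_M(\mathrm{Ann}(\mathcal{M})), \qquad I \subseteq \mathrm{Ann}(L_M(I)). \qquad (5.3)$$

These inclusions are strict in general. Moreover we may state that

$$\mathrm{Ann}(\mathcal{M}_1 + \mathcal{M}_2) = \mathrm{Ann}(\mathcal{M}_1) \cap \mathrm{Ann}(\mathcal{M}_2); \qquad L_M(I_1 + I_2) = L_M(I_1) \cap L_M(I_2);$$

$$\mathrm{Ann}(\mathcal{M}_1 \cap \mathcal{M}_2) \supseteq \mathrm{Ann}(\mathcal{M}_1) + \mathrm{Ann}(\mathcal{M}_2); \qquad L_M(I_1 \cap I_2) \supseteq L_M(I_1) + L_M(I_2); \qquad (5.4)$$

for any ideals $I_1, I_2 \in \mathfrak{A}_k$ and modules $\mathcal{M}_1, \mathcal{M}_2 \in \mathfrak{M}_k$. The inclusions (5.4) are
also strict in general, but in particular, we get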

Proposition 19 Let $I_1, I_2$ be comaximal ideals of $\mathcal{P}_k$. Then $L_M(I_1I_2) = L_M(I_1) \dotplus L_M(I_2)$
(a direct sum), and it is a cyclic $\mathcal{P}_k$-module if and only if the
modules $L_M(I_s)$, $s = 1, 2$, are cyclic.

If $M = R$ is a field and $k = 1$ then the correspondences (5.2) are bijections,
and the inclusions (5.3), (5.4) are equalities. The finite modules satisfying these
conditions (i.e. admitting a theory of LRS analogous to the theory of LRS over
a field) are described as in Section 1.

Theorem 20 ([50,52,54]) The following conditions are equivalent:

(a) ${}_RM$ is a QF-module;

(b) the Galois correspondences (5.2) are bijective;

(c) for any monic ideal $I \lhd \mathcal{P}_k$ the family $L_M(I)$ is a QF-module over the
ring $S = \mathcal{P}_k/I$, and any module $\mathcal{M} \in \mathfrak{M}_k$ is a QF-module over the ring
$S = \mathcal{P}_k/\mathrm{Ann}(\mathcal{M})$;

(d) the inclusions (5.)) are equalities; 

(e) for any recurrences $\mu, \nu \in \mathcal{L}M^{\langle k \rangle}$ we have

$$\nu \in \mathcal{P}_k\mu \iff \mathrm{Ann}(\mu) \subseteq \mathrm{Ann}(\nu).$$

The following essential supplement to the last Theorem shows an interesting 
connection between cyclic LRS-families and QF-rings. 



Theorem 21 ([50,52,54]) Let ${}_RQ$ be a quasi-Frobenius module. Then for any
monic ideal $I \lhd \mathcal{P}_k$ the following conditions are equivalent:

(a) $I = \mathrm{Ann}(\mu)$ for some recurrence $\mu \in \mathcal{L}Q^{\langle k \rangle}$;

(b) $\mathcal{M} = L_Q(I)$ is a cyclic $\mathcal{P}_k$-module;

(c) $S = \mathcal{P}_k/I$ is a quasi-Frobenius ring.



Corollary 22 Let $F_1(x_1), \dots, F_k(x_k)$ be monic polynomials from $\mathcal{P}_k$ and
$S = \mathcal{P}_k/(F_1(x_1), \dots, F_k(x_k))$. Then $S$ is a QF-ring if and only if $R$ is a QF-ring.

Theorem 20(c) allows us to build the QF-module over any finite commutative
ring $S$ as a $k$-LRS family over some principal ideal ring. Indeed, the ring $S$ is
an extension $S = R[\pi_1, \dots, \pi_k]$ of a principal ideal subring $R < S$ [29]. Then
$S = \mathcal{P}_k/I$ for some monic ideal $I \lhd R[x_1, \dots, x_k]$. Since $R$ is a QF-ring, Theorem
20(c) implies that $L_R(I)$ is the required QF-module over $S$.

5.3. The Berlekamp-Massey Algorithm. Let ${}_RM$ be a finite left module
over a finite (not necessarily commutative) ring $R$. We say that a polynomial
$F(x) = x^s - c_{s-1}x^{s-1} - \dots - c_1x - c_0 \in R[x]$ generates (from the left) a sequence
$u(\overline{0, l-1}) = (u(0), u(1), \dots, u(l-1)) \in M^l$ of length $l$ if $s \ge l$, or $s < l$ and

$$u(i + s) = c_{s-1}u(i + s - 1) + \dots + c_1u(i + 1) + c_0u(i), \quad i \in \overline{0, l - s - 1}.$$

A monic polynomial of the smallest degree which generates $u(\overline{0, l-1})$ is called its
(left) minimal polynomial. The Berlekamp-Massey algorithm finds a minimal
polynomial of a sequence of length $l$ with complexity $O(l^2)$ operations of $R$
and $M$. The Berlekamp-Massey algorithm was first described for
sequences over fields in [1] and [42]. Since then many versions of this algorithm
over fields and rings have been proposed (see, e.g., [19,64,68], and the bibliography in
[32]). In the general form presented here the Berlekamp-Massey algorithm was built
in [32].
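The general module version of [32] is not reproduced here. As a point of reference, the following Python sketch implements only the classical field case over $GF(p)$, $p$ prime, and returns the coefficients $c_0, \dots, c_{s-1}$ of a minimal polynomial in the normalization used above. The function name and the return convention are assumptions made for this illustration, and the modular inverse `pow(b, -1, p)` requires Python 3.8 or later.

```python
def berlekamp_massey(seq, p):
    """Classical Berlekamp-Massey over GF(p), p prime.

    Returns [c_0, ..., c_{s-1}] with s minimal such that
    u(i+s) = c_{s-1} u(i+s-1) + ... + c_0 u(i) (mod p) on the given segment.
    """
    C, B = [1], [1]          # current and previous connection polynomials
    L, shift, b = 0, 1, 1    # current degree, gap since last change, old discrepancy
    for n, value in enumerate(seq):
        # discrepancy between seq[n] and the value predicted by C
        d = (value + sum(C[j] * seq[n - j] for j in range(1, L + 1))) % p
        if d == 0:
            shift += 1
            continue
        coef = d * pow(b, -1, p) % p
        previous = C[:]
        C += [0] * (len(B) + shift - len(C))
        for j, bj in enumerate(B):
            C[j + shift] = (C[j + shift] - coef * bj) % p
        if 2 * L <= n:
            L, B, b, shift = n + 1 - L, previous, d, 1
        else:
            shift += 1
        C += [0] * (L + 1 - len(C))   # keep len(C) >= L + 1
    # C(x) = 1 + C_1 x + ... + C_L x^L is the connection polynomial;
    # the recurrence coefficients are c_i = -C_{L-i}.
    return [(-C[L - i]) % p for i in range(L)]

fib_mod_7 = [0, 1, 1, 2, 3, 5, 1, 6]
print(berlekamp_massey(fib_mod_7, 7))   # coefficients of x^2 - x - 1
```

For the Fibonacci segment modulo 7 the result corresponds to $F(x) = x^2 - x - 1$, i.e. $c_0 = c_1 = 1$.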
6 Presentations of Linear Codes by Polylinear Recurrences

Some linear codes over a finite module ${}_RM$ (and all linear codes over any
QF-module ${}_RQ$!) may be described in terms of polylinear recurrences over ${}_RM$. Any
finite subset

$$\mathcal{F} = \{\mathbf{i}_1, \dots, \mathbf{i}_n\} \subset \mathbb{N}_0^k \qquad (6.1)$$

is called a polyhedron. Let $M^{\mathcal{F}}$ be the $R$-module of all functions $\delta : \mathcal{F} \to M$.
Any such function is uniquely determined by its valuation diagram
$\delta[\mathcal{F}] = (\delta(\mathbf{i}_1), \dots, \delta(\mathbf{i}_n)) \in M^n$. It is clear that the module ${}_RM^{\mathcal{F}}$ is isomorphic to ${}_RM^n$.
Of course, for any $k$-sequence $\mu \in M^{\langle k \rangle}$ we may also consider the valuation
diagram $\mu[\mathcal{F}] = (\mu(\mathbf{i}_1), \dots, \mu(\mathbf{i}_n))$. Let now $I$ be a monic ideal of $\mathcal{P}_k$ and

$$\mathcal{K} = L_M^{\mathcal{F}}(I) = \{\mu[\mathcal{F}] : \mu \in L_M(I)\}. \qquad (6.2)$$

Evidently, $\mathcal{K}$ is a submodule of the $R$-module ${}_RM^{\mathcal{F}}$ and by the indexing (6.1)
we may consider $\mathcal{K}$ as a submodule of ${}_RM^n$. Thus $\mathcal{K}$ is a linear $n$-code over ${}_RM$.
In general not every linear code over ${}_RM$ has the form (6.2), but at the same
time we have

Proposition 23 [60]. Let ${}_RQ$ be a QF-module. Then for any linear code $\mathcal{K} < {}_RQ^n$
there exist a parameter $k \in \overline{1,n}$, a polyhedron $\mathcal{F} \subset \mathbb{N}_0^k$ of cardinality $n$ and
a monic ideal $I \lhd \mathcal{P}_k$ such that

$$\mathcal{K} = L_Q^{\mathcal{F}}(I). \qquad (6.3)$$

It is interesting to find the smallest $k$ such that the code $\mathcal{K} < {}_RQ^n$ has the
representation (6.3) with the additional condition: $\mathcal{F}$ is a Ferrers diagram, i.e.
if $\mathbf{i} \in \mathcal{F}$, $\mathbf{j} \in \mathbb{N}_0^k$ and $\mathbf{j} \le \mathbf{i}$ (in each coordinate) then $\mathbf{j} \in \mathcal{F}$.

Let us call a code $\mathcal{K} < {}_RM^n$ recursive of dimension $k$ (or $k$-dim recursive)
if it has a presentation (6.2) for some ideal $I \lhd \mathcal{P}_k$, Ferrers diagram $\mathcal{F} \subset \mathbb{N}_0^k$ and
the ordering (6.1) of its elements. The minimal $k$ with this property will be
called the recursive dimension of the code $\mathcal{K}$. The first important class of recursive
codes is given by

Theorem 24 (Asturian Theorem [60].) Let $\mathcal{K} < {}_RM^n$ be a systematic linear
code of the rank $m$ with a parity-check matrix

$$H = \begin{pmatrix} f_0^{(1)} & \dots & f_{m-1}^{(1)} & -e & 0 & \dots & 0 \\ f_0^{(2)} & \dots & f_{m-1}^{(2)} & 0 & -e & \dots & 0 \\ \vdots & & \vdots & \vdots & \vdots & \ddots & \vdots \\ f_0^{(k)} & \dots & f_{m-1}^{(k)} & 0 & 0 & \dots & -e \end{pmatrix}_{k \times n}.$$

Then $\mathcal{K}$ is a $k$-dim recursive code satisfying the equality (6.2), where $I \lhd \mathcal{P}_k$ is
the ideal generated by the polynomials $F_1(x_1) = x_1^m - f_{m-1}^{(1)}x_1^{m-1} - \dots - f_0^{(1)}$,
$F_2(x_1, x_2) = x_2 - f_{m-1}^{(2)}x_1^{m-1} - \dots - f_0^{(2)}$, $\dots$, $F_k(x_1, x_k) = x_k - f_{m-1}^{(k)}x_1^{m-1} - \dots - f_0^{(k)}$,
and $\mathcal{F} \subset \mathbb{N}_0^k$ is the Ferrers diagram of the form
$\mathcal{F} = \{0, e_1, 2e_1, \dots, me_1, e_2, \dots, e_k\}$.

7 Recursive and Linear Recursive MDS Codes

The connection between MDS-codes, Latin squares and quasigroups is well-known.
We present here the results of [10,11,12] where these objects are studied
under the additional practically useful condition of recursivity for the
corresponding code.

7.1. Quality Parameters of Recursive Codes. A code $\mathcal{K} \subseteq \Omega^n$ in an
alphabet $\Omega$ of $q$ elements is called $k$-recursive, $1 \le k < n$, if there exists
a function $f : \Omega^k \to \Omega$ such that $\mathcal{K} = \mathcal{K}(n, f)$ consists of all the rows
$u(\overline{0, n-1}) = (u(0), \dots, u(n-1)) \in \Omega^n$ with the property

$$u(i + k) = f(u(i), \dots, u(i + k - 1)), \quad i \in \overline{0, n - k - 1}.$$

In other words, $\mathcal{K}$ is the set of all output $n$-sequences of a feedback shift register
with a feedback function $f$. In [10] the MDS-codes of this type, i.e. recursive
$[n, k, n - k + 1]_q$-codes, are investigated, and the following parameters are
considered:
n(fc, q) - the maximum of lengths of MDS codes 1C of (combinatorial) dimension 
k (|/C| = g*) in alphabet 17 of cardinality q; 
rC{k, q) - the maximum of lengths of fc-recursive MDS codes of the same type; 
/(fc, q) - the maximum of lengths of MDS codes 1C of (combinatorial) dimension 
fc which are linear over an Abelian group (17,-|-) for some operation -|- (i.e. 
/C is a subgroup of the Abelian group 17" and \JC\ = g*, where g = |17| and 
n G N). We also call such codes linear in wide sense. 
l'"{k, q) - the analog of l{k, q) for recursive codes. 

For the primary (the power of a prime) numbers g one can also define 

m{k, q) - the analog of /(fc, g) for the codes which are linear over the field F^. 
mC{k,q) - the analog of m{k,q) for the recursive codes. 
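The definition of a $k$-recursive code can be checked by brute force for small parameters. The following is a minimal sketch, not taken from [10]: the alphabet is assumed to be $\mathbb{Z}_5$, and the feedback function `feedback` (the linear recurrence $u(i+2) = 3u(i) + 4u(i+1)$, whose characteristic polynomial $x^2 + x + 2$ is primitive over $\mathbb{F}_5$) is a hypothetical choice made only for illustration. All function names are illustrative.

```python
from itertools import product

def recursive_code(q, k, n, f):
    """All words of length n over Z_q = {0, ..., q-1} that satisfy
    u(i+k) = f(u(i), ..., u(i+k-1)) for i = 0, ..., n-k-1."""
    words = []
    for seed in product(range(q), repeat=k):
        u = list(seed)
        for i in range(n - k):
            u.append(f(*u[i:i + k]) % q)
        words.append(tuple(u))
    return words

def min_distance(words):
    """Minimum Hamming distance over all pairs of distinct words."""
    return min(
        sum(a != b for a, b in zip(u, v))
        for idx, u in enumerate(words)
        for v in words[idx + 1:]
    )

def is_mds(words, q, k, n):
    """MDS test: q^k distinct words with distance n - k + 1."""
    return len(set(words)) == q ** k and min_distance(words) == n - k + 1

def feedback(x0, x1):
    # Hypothetical feedback function over Z_5 (an assumption of this sketch).
    return 3 * x0 + 4 * x1

code6 = recursive_code(5, 2, 6, feedback)
code7 = recursive_code(5, 2, 7, feedback)
print(is_mds(code6, 5, 2, 6), is_mds(code7, 5, 2, 7))
```

With this choice the code of length 6 is MDS while the length-7 extension is not, so the sketch exhibits a recursive MDS code of length $q + 1$ for $q = 5$, $k = 2$.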

Moreover, we call the above function $f(x)$ idempotent if it satisfies the identity
$f(x, \ldots, x) = x$ (this means that all "constants" $(a, \ldots, a)$ belong to $\mathcal{K}(n, f)$).
Thus, in addition to the above six parameters we can introduce $n^{ir}(k, q)$, $l^{ir}(k, q)$
and $m^{ir}(k, q)$ (the last only for primary $q$); "ir" means "idempotent recursive". So we
have the following matrix of parameters

$$M(k, q) = \begin{pmatrix}
[\, m^{ir}(k, q) & m^{r}(k, q) & m(k, q) \,] \\
l^{ir}(k, q) & l^{r}(k, q) & l(k, q) \\
n^{ir}(k, q) & n^{r}(k, q) & n(k, q)
\end{pmatrix}$$

whose entries are nondecreasing from left to right and from top to bottom.
Naturally, the first row of this matrix (put in brackets) is present only when $q$ is
primary. It is interesting to estimate and to compare the entries of $M(k, q)$ for
various values of $k$ and $q$. In what follows the equality $x^{y}(k, q) = k$ means that
the corresponding code does not exist. A standard argument gives the following
source of estimations for the entries of $M(k, q)$:

Proposition 25 If $x \in \{l, n\}$, $y \in \{\emptyset, r, ir\}$ and $k, q_1, q_2 \in \overline{2, \infty}$ then
$x^{y}(k, q_1 q_2) \ge \min\{x^{y}(k, q_1), x^{y}(k, q_2)\}$.

7.2. Results for $k = 2$. It is well known that $n(2, q) = 2 + N(q)$, where $N(q)$
is the maximal number of mutually orthogonal Latin squares of order $q$. The latter
were studied extensively (see [15,16,21]). We cite here only the following general
conclusion:

Theorem 26 Let $q \in \mathbb{N}$, $q > 1$. Then:

(a) $N(q) \ge 2$ if $q \notin \{2, 6\}$, $N(q) \ge 3$ if $q \notin \{2, 3, 6, 10\}$;

(b) $N(q) \le q - 1$; if $q$ is primary, then $N(q) = q - 1$;

(c) $N(q_1 q_2) \ge \min\{N(q_1), N(q_2)\}$; in particular, if $q = q_1 \cdot \ldots \cdot q_t$ is the canonical
factorization of $q$ (into powers of distinct primes) then $N(q) \ge \min\{q_1 - 1, \ldots, q_t - 1\}$;

(d) N{q) > q^ — 2. 
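The lower bound in part (c) is easy to evaluate. The sketch below is only an illustration under the assumption that the canonical factorization means the factorization into powers of distinct primes; the function names are hypothetical.

```python
def prime_power_factors(q):
    """Factor q >= 2 into powers of distinct primes, e.g. 12 -> [4, 3]."""
    factors = []
    p = 2
    while p * p <= q:
        if q % p == 0:
            power = 1
            while q % p == 0:
                q //= p
                power *= p
            factors.append(power)
        p += 1
    if q > 1:
        factors.append(q)  # remaining prime factor
    return factors

def latin_square_lower_bound(q):
    """Lower bound min{q_1 - 1, ..., q_t - 1} for N(q) from part (c)."""
    return min(pp - 1 for pp in prime_power_factors(q))

for q in (7, 12, 16, 20):
    print(q, latin_square_lower_bound(q), 2 + latin_square_lower_bound(q))
```

The last printed column is the resulting lower bound for $n(2, q) = 2 + N(q)$. For primary $q$ the bound is exact by part (b); for other $q$ it may be weak (for $q = 10$ it gives 1, while part (a) gives $N(10) \ge 2$).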



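For small values of $q$ we have (omitting the trivial cases $q = 2, 3$):

$$M(2, 4) = \begin{pmatrix} 3 & 5 & 5 \\ 3 & 5 & 5 \\ 3 & 5 & 5 \end{pmatrix}, \quad
M(2, 5) = \begin{pmatrix} 5 & 6 & 6 \\ 5 & 6 & 6 \\ 5 & 6 & 6 \end{pmatrix}, \quad
M(2, 7) = \begin{pmatrix} 7 & 8 & 8 \\ 7 & 8 & 8 \\ 7 & 8 & 8 \end{pmatrix}.$$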

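Proposition 27 If $q$ is a primary number then $m^{r}(2, q) = n^{r}(2, q) = n(2, q) = q + 1$;

$$m^{ir}(2, q) = n^{ir}(2, q) = \begin{cases} q & \text{if } q \text{ is a prime;} \\ q - 1 & \text{if } q \text{ is not a prime} \end{cases}$$

(A. Abashin (private communication)).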

Corollary 28 For a prime $p$ and $t > 1$ there are no cyclic codes among those
permutation equivalent to Reed-Solomon $[p^t, 2, p^t - 1]$- or $[p^t, p^t - 2, 3]$-codes.

Theorem 29 ([10]) For any $q > 2$, except $q = 6$ and possibly $q \in \{14, 18, 26, 42\}$,

$$n^{r}(2, q) \ge 4.$$

In fact, the last inequality may be sharpened for many values of $q$. Some of the
stronger estimations are easily deduced from Propositions 25 and 27. We call them
standard ones. Other estimations are presented in

Theorem 30 ([10]) The following nonstandard lower estimations are valid:
$n^{r}(2, q) \ge 8$ for $q = 80$.
$n^{r}(2, q) \ge 7$ for $q \in \{50, 57, 58, 65, 70, 78, 84, 85, 86, 92, 94, 95, 96, 97, 98\}$.
$n^{r}(2, q) \ge 6$ for $q \in \{54, 62, 66, 68, 69, 74, 75, 76, 82, 87, 90, 93\}$.
$n^{r}(2, q) \ge 5$ for $q \in \{21, 24, 39, 44, 60\}$.

$n^{ir}(2, q) \ge 7$ for $q \in \{50, 57, 58, 65, 70, 78, 84, 85, 86, 92, 94, 95, 96, 97, 98\}$.
$n^{ir}(2, q) \ge 6$ for $q \in \{54, 62, 66, 68, 69, 74, 75, 76, 82, 87, 90, 93\}$.

7.3. Results for $k > 2$. The following estimations in the non-recursive case are
known.

Theorem 31 ([43,21]) 

(a) If $q \le k$ then $n(k, q) = k + 1$.

(b) If $k \le q$ and $q$ is even then $n(k, q) \le q + k - 1$.

(c) If $3 \le k \le q$ and $q$ is odd then $n(k, q) \le q + k - 2$.

(d) If k G and q is primary (i.e. a power of a prime) then ml{k, q) > 

(e) n(3, q) = q + 1 for primary odd q. 

(f) $n(3, q) = n(q - 1, q) = q + 2$ for primary even $q$.

The “recursive version” of these results is contained in the following 

Proposition 32 ([12]) If q < k then F{k,q) = n{k,q) = k+1 and, for primary 
q, m^{k, q) = k + 1. 



The well-known conjecture of Bush [5,43] states that $m(k, q) = q + 1$
for $2 \le k \le q$, except for the case $m(3, q) = m(q - 1, q) = q + 2$ for even $q$.
For $k = 3$ we have $m^{r}(3, q) = q + 1$, and PC calculations give

Proposition 33 For every primary $q \in \overline{4, 128}$, the number of linear recursive
$[q + 1, 3, q - 1]$-codes is equal to $(1/2)\varphi(q + 1)(q - 1)$.
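The count in Proposition 33 can be tabulated directly. The sketch below only evaluates the stated formula $(1/2)\varphi(q + 1)(q - 1)$, assuming $\varphi$ is Euler's totient function; it does not enumerate the codes themselves, and all names are illustrative.

```python
from math import gcd

def euler_phi(n):
    """Euler's totient function by direct counting (fine for small n)."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def is_prime_power(q):
    """True if q = p^t for a prime p and t >= 1."""
    if q < 2:
        return False
    p = 2
    while p * p <= q:
        if q % p == 0:
            while q % p == 0:
                q //= p
            return q == 1
        p += 1
    return True  # q itself is prime

def code_count(q):
    """Value of (1/2) * phi(q + 1) * (q - 1) from Proposition 33."""
    return euler_phi(q + 1) * (q - 1) // 2

table = {q: code_count(q) for q in range(4, 129) if is_prime_power(q)}
print(table[4], table[5], table[8], table[9])
```

For example, the formula gives 6 codes for $q = 4$ and 21 codes for $q = 8$.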

Thus, all the codes enumerated in this Proposition may be constructed in
some natural way from the linear cyclic codes indicated in [43, Ch. 11, Theorem
9]. One can conjecture that this is true for all primary $q$.

For the case $k = 3$, $q = 4$ we have the following

Proposition 34 $m^{r}(3, 4) = 5 < l^{r}(3, 4) = n^{r}(3, 4) = m(3, 4) = n(3, 4) = 6$.

One of the codes whose existence proves the preceding proposition is linear 
in wide sense with recursive function /(xi, X2, 2^3) = ax') + ax 2 + x). over F4 = 
F2(a) We call it the Asturian code. 

The existence of the Asturian code gives some important theoretical corollaries:
(1) there exist recursive codes, linear in the wide sense, that are better than any
recursive code linear in the classical sense, and (2) for some of the best known
codes that are linear in the classical sense but non-recursive, there exist recursive
codes, linear in the wide sense, with the same parameters. However, for $k = 3$ the
Asturian code is an exceptional example, because PC calculations give
$l^{r}(3, 8) = 9 = 8 + 1$, $l^{r}(3, 16) = 17 = 16 + 1$, and the following statement is valid:

Proposition 35 (A. Abashin, private communication)

If $t > 4$ then $l^{r}(3, 2^t) = 2^t + 1$.



Corollary 36 There are no cyclic codes among those linearly (in the wide
sense) equivalent to twice extended Reed-Solomon $[2^t + 2, 3, 2^t]$- or
$[2^t + 2, 2^t - 1, 4]$-codes.

8 Periodic Properties of Polylinear Recurrences over
Finite Modules

In the theory of LRS over Galois fields the technique of period calculation for linear
recurrences and the constructions of maximal period LRS occupy an important
place. In [33,36] these results are generalized to the $k$-LRS over finite rings and
modules. Here we present only two series of results that illustrate the advance
in this direction.

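8.1. Periodic $k$-Sequences and $k$-LRS [33,36]. Let $\mu \in M^{\langle k \rangle}$ be a $k$-sequence.
For fixed $\mathbf{l} \in \mathbb{N}_0^k$, $\mathbf{d} \in \mathbb{N}_0^k \setminus 0$ we call the 1-sequence $\gamma(z) = \mu(\mathbf{l} + \mathbf{d}z)$
a regular $(\mathbf{l}, \mathbf{d})$-extract (extract in the direction $\mathbf{d}$) of the sequence $\mu$. We say that
the sequence $\mu$ is $(\mathbf{l}, \mathbf{d})$-periodic (periodic in the direction $\mathbf{d}$) if $\gamma$ is a periodic
sequence (for any $\mathbf{l} \in \mathbb{N}_0^k$). A sequence $\mu \in M^{\langle k \rangle}$ is called a periodic (reversible)
sequence if any regular $(\mathbf{l}, \mathbf{d})$-extract of this sequence is a periodic (reversible)
1-sequence.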

Proposition 37 For $\mu \in M^{\langle k \rangle}$ the following conditions are equivalent:

(a) $\mu$ is a periodic (respectively, reversible) sequence;

(b) the ideal $\mathrm{Ann}(\mu)$ contains a system of elementary polynomials of the form
$x_1^{l_1}(x_1^{t_1} - e), \ldots, x_k^{l_k}(x_k^{t_k} - e)$ (respectively, of the form $x_1^{t_1} - e, \ldots, x_k^{t_k} - e$) for
some $l_s \in \mathbb{N}_0$, $t_s \in \mathbb{N}$, $s \in \overline{1, k}$;

(c) $\mu$ is periodic (respectively, reversible) in each of the directions $e_1, \ldots, e_k$,
where $e_s$ is the $s$-th row of the identity matrix.



Proposition 38 A k-sequence (over a finite module) is periodic if and only if 
it is a k-LRS. 

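A nonzero vector $\mathbf{t} \in \mathbb{N}_0^k$ is called a vector-period of the sequence $\mu \in M^{\langle k \rangle}$ if
$x^{\mathbf{l}}(x^{\mathbf{t}} - e)\mu = 0$ for some $\mathbf{l} \in \mathbb{N}_0^k$. The subgroup $\mathfrak{P}(\mu) \le (\mathbb{Z}^k, +)$ generated by all
vector-periods of $\mu$ will be called its group of periods. If $\mu$ has no vector-periods,
then $\mathfrak{P}(\mu) = 0$.

Proposition 39 The reversibility of the sequence $\mu \in M^{\langle k \rangle}$ is equivalent to the
condition

$$\forall \mathbf{i} \in \mathbb{N}_0^k \ \exists \mathbf{j} \in \mathbb{N}_0^k : \ x^{\mathbf{j}}(x^{\mathbf{i}}\mu) = \mu.$$

If the sequence $\mu$ is reversible, then for any $\mathbf{i} \in \mathbb{N}_0^k$ the sequence $\nu = x^{\mathbf{i}}\mu$ is also
reversible and $\mathfrak{P}(\nu) = \mathfrak{P}(\mu)$; for any $\mathbf{t} \in \mathfrak{P}(\mu)$, we have $x^{\mathbf{t}}\mu = \mu$.

The set $O(\mu)$ of all $k$-sequences $\nu \in M^{\langle k \rangle}$ of the form $\nu = x^{\mathbf{i}}\mu$, $\mathbf{i} \in \mathbb{N}_0^k$, is
called the trajectory of $\mu$.

Proposition 40 A sequence $\mu \in M^{\langle k \rangle}$ is periodic if and only if its trajectory
$O(\mu)$ is finite, and it is reversible if and only if $O(\mu) = O(x^{\mathbf{i}}\mu)$ for any $\mathbf{i} \in \mathbb{N}_0^k$.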

For a periodic sequence $\mu$ the set $\mathcal{T}(\mu)$ of all reversible elements of its trajectory
$O(\mu)$ is called the cycle of the sequence $\mu$, and its cardinality $T(\mu) = |\mathcal{T}(\mu)|$
is called the period of the sequence $\mu$. The set $\mathcal{D}(\mu) = O(\mu) \setminus \mathcal{T}(\mu)$ is called the set
of all defect elements of the trajectory $O(\mu)$, and its cardinality $D(\mu) = |\mathcal{D}(\mu)|$
is called the defect of the sequence $\mu$. The sequence $\mu$ is said to be degenerating
if it is periodic and its cycle contains only the zero sequence, i.e., $\mathcal{T}(\mu) = \{0\}$.

Thus, $D(\mu) + T(\mu) = |O(\mu)|$, and a periodic sequence $\mu$ is reversible iff
$D(\mu) = 0$, i.e., $\mathcal{T}(\mu) = O(\mu)$. A periodic sequence is degenerating iff $x^{\mathbf{i}} \in \mathrm{Ann}(\mu)$
for some $\mathbf{i} \in \mathbb{N}_0^k$.
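For $k = 1$ these notions reduce to the familiar preperiod and period of a sequence, and they can be computed by following the trajectory of states. The sketch below assumes a 1-LRS over the residue ring $\mathbb{Z}_N$ given by $u(i + m) = c_0 u(i) + \ldots + c_{m-1} u(i + m - 1)$; the function name and the two example recurrences are illustrative choices, not taken from [33,36].

```python
def defect_and_period(coeffs, modulus, seed):
    """Follow the shifts of a 1-LRS over Z_modulus until a state repeats.

    The recurrence is u(i+m) = sum(coeffs[j] * u(i+j)) mod modulus.
    A shift x^i u is determined by its state (u(i), ..., u(i+m-1)), so the
    number of distinct states equals the size of the trajectory.
    Returns (defect, period).
    """
    state = tuple(s % modulus for s in seed)
    first_seen = {}
    step = 0
    while state not in first_seen:
        first_seen[state] = step
        nxt = sum(c * s for c, s in zip(coeffs, state)) % modulus
        state = state[1:] + (nxt,)
        step += 1
    defect = first_seen[state]      # shifts visited before entering the cycle
    period = step - defect          # number of shifts on the cycle
    return defect, period

# Fibonacci recurrence over Z_4: reversible, so the defect is 0.
print(defect_and_period([1, 1], 4, (0, 1)))
# u(i+1) = 2 u(i) over Z_4 from u(0) = 1: 1, 2, 0, 0, ... is degenerating.
print(defect_and_period([2], 4, (1,)))
```

In both cases the defect plus the period equals the number of distinct shifts visited, in agreement with $D(\mu) + T(\mu) = |O(\mu)|$.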

Proposition 41 If $\mu \in M^{\langle k \rangle}$ is a periodic sequence, then $T(\mu)$ is the index $[\mathbb{Z}^k : \mathfrak{P}(\mu)]$
of the subgroup $\mathfrak{P}(\mu)$ of the group $(\mathbb{Z}^k, +)$, and $|O(\mu)| \le |\mathcal{P}_k / \mathrm{Ann}(\mu)|$.

Let us denote the P^-modules of all reversible and degenerating fc-sequences over 
rM by and respectively. 

Proposition 42 . 

The following result gives us an interesting relation between the properties 
of reversible sequences and the properties of associated rings. 

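Theorem 43 Let $\mu \in M^{\langle k \rangle}$, and let $S = \mathcal{P}_k / \mathrm{Ann}(\mu)$ be the operator ring of $\mu$
(see Sec. 5), $\theta_s = x_s + \mathrm{Ann}(\mu)$ for $s \in \overline{1, k}$. Then the sequence $\mu$ is reversible if
and only if $\theta_1, \ldots, \theta_k \in S^*$. If $\mu$ is a reversible sequence, then

$$T(\mu) = |\langle \theta_1, \ldots, \theta_k \rangle| \le |S^*| \le |S| - 1,$$

where $\langle \theta_1, \ldots, \theta_k \rangle$ is the subgroup of the group $S^*$ generated by $\theta_1, \ldots, \theta_k$. The
equality $T(\mu) = |S^*|$ holds if and only if

$$S^* = \langle \theta_1, \ldots, \theta_k \rangle. \qquad (8.1)$$

If $\mu$ is a reversible sequence and $\mathrm{Ann}(\mu) \cap R = 0$, then $T(\mu) = |S| - 1$ if and
only if the following three conditions hold:

(a) $R = \mathbb{F}_q$;

(b) $\mathrm{Ann}(\mu)$ is a maximal ideal of the ring $\mathcal{P}_k$ (i.e., $S = \mathbb{F}_{q^r}$ for some $r \in \mathbb{N}$);

(c) the equality (8.1) is true.

Let ${}_RQ$ be a QF-module corresponding to the ring $R$. A reversible sequence
$\mu \in Q^{\langle k \rangle}$ is called full-cycle if its annihilator $I = \mathrm{Ann}(\mu)$ satisfies the conditions

$$I \cap R = 0; \qquad S^* = \langle \theta_1, \ldots, \theta_k \rangle; \qquad L_Q(I) = S\mu.$$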



Theorem 44 Let $\mu$ be a full-cycle $k$-recurrence over ${}_RQ$; then $S = \mathcal{P}_k / I$ is a
QF-ring and for any $\nu \in L_Q(I) = S\mu$

$$\mathfrak{P}(\mu) \subseteq \mathfrak{P}(\nu); \qquad T(\nu) \mid T(\mu); \qquad (T(\nu) = T(\mu)) \Leftrightarrow (\nu \in \mathcal{T}(\mu)).$$

Theorem 45 Let $R \le S$ where $S$ is a QF-ring. Then there exists a full-cycle
LRS over the QF-module ${}_RQ$ with the ring of operators isomorphic to $S$.

8.2. Linear Recurrences of Maximal Period over Galois Ring. This is
the most deeply studied class of linear recurrences over rings, and a very
interesting one in various applications. Let $R = GR(q^n, p^n)$ be a Galois ring of order $q^n$
and of characteristic $p^n$ ($p$ is a prime, $q = p^r$), and let $F(x) \in \mathcal{P} = R[x]$ be a
monic polynomial of degree $m$. Then for any LRS $u \in L_R(F(x))$ the inequality
$T(u) \le (q^m - 1)p^{n-1}$ holds. If in this situation $T(u) = (q^m - 1)p^{n-1}$, then we
say that the sequence $u$ is a linear recurring sequence of maximal period
(MP-recurrence) of rank $m$ over the Galois ring $R$.

For any monic polynomial $F(x) \in \mathcal{P}$ there exist $\lambda \in \mathbb{N}_0$, $t \in \mathbb{N}$ such that
$F(x) \mid x^{\lambda}(x^{t} - e)$. The minimal $t$ with this property is called the period of $F(x)$ and
denoted by $T(F)$.

Denote by $\bar{u}$ and $\bar{F}$ the images, respectively, of a sequence $u$ and a polynomial
$F$ under the natural homomorphism $R \to \bar{R} = R/pR$. Note that
if $\deg F(x) = m$, then $T(F) \le T(\bar{F}(x))p^{n-1} \le (q^m - 1)p^{n-1}$. If $T(F) =
(q^m - 1)p^{n-1}$, then $F$ is called a polynomial of maximal period (MP-polynomial)
over $R$.

Proposition 46 Under the above assumptions an LRS $u \in L_R(F)$ is an MP-recurrence
of rank $m$ if and only if $F(x)$ is an MP-polynomial over $R$ and $\bar{u} \ne 0$.
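For the residue ring $R = \mathbb{Z}_{p^n}$ (so $q = p$) the maximal-period property can be tested by brute force for small parameters. The sketch below is an illustration only: it runs the impulse sequence $0, \ldots, 0, 1, \ldots$, whose reduction modulo $p$ is nonzero, and compares its period with $(p^m - 1)p^{n-1}$, which by Proposition 46 decides whether $F$ is an MP-polynomial. The function names and the sample polynomials are assumptions of this sketch.

```python
def lrs_period(a, modulus):
    """Period of the impulse sequence with characteristic polynomial
    F(x) = x^m + a[m-1] x^(m-1) + ... + a[0] over Z_modulus, i.e.
    u(i+m) = -(a[0] u(i) + ... + a[m-1] u(i+m-1)) mod modulus."""
    m = len(a)
    state = (0,) * (m - 1) + (1,)
    first_seen = {}
    step = 0
    while state not in first_seen:
        first_seen[state] = step
        nxt = -sum(c * s for c, s in zip(a, state)) % modulus
        state = state[1:] + (nxt,)
        step += 1
    return step - first_seen[state]

def is_mp_polynomial(a, p, n):
    """Brute-force test of maximal period over Z_{p^n}."""
    m = len(a)
    return lrs_period(a, p ** n) == (p ** m - 1) * p ** (n - 1)

# F(x) = x^2 - x + 1 over Z_4: period 6 = (2^2 - 1) * 2, maximal.
print(lrs_period([1, -1], 4), is_mp_polynomial([1, -1], 2, 2))
# F(x) = x^2 + x + 1 over Z_4: period 3, so not maximal.
print(lrs_period([1, 1], 4), is_mp_polynomial([1, 1], 2, 2))
```

Both polynomials reduce to $x^2 + x + 1$ modulo 2, which has period $2^2 - 1 = 3$ over $\mathbb{F}_2$; only the first lifts to a polynomial of maximal period over $\mathbb{Z}_4$.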

In [33] a simple algorithm for building an MP-polynomial was presented.
It has the following simplest form for $p = 2$: let $R = GR(q^n, 2^n)$, $q = 2^r$, and let
$G(x) \in R[x]$ be a monic polynomial of degree $m$ such that $T(G) = q^m - 1$. Let
the polynomials $G^{[0]}(x), G^{[1]}(x), \ldots \in \mathcal{P}$ be defined recursively: $G^{[0]}(x) = G(x)$,
and if $G^{[k]}(x) = G_0^{[k]}(x^2) + x\,G_1^{[k]}(x^2)$ then

$$G^{[k+1]}(x) = (-1)^m \left( G_0^{[k]}(x)^2 - x\, G_1^{[k]}(x)^2 \right).$$



Theorem 47 Any polynomial of the form 

F{x) = G^^\x) 2A{x), degA{x) < m, A{x) yf 0,e 



is an MP-polynomial 

There are some simple conditions for a polynomial over the residue ring $\mathbb{Z}_{p^n}$ to
have maximal period.

Theorem 48 ([33]) Let $F(x) = x^m + a_{m-1}x^{m-1} + \ldots + a_0$ be a polynomial
over $\mathbb{Z}_{p^n}$ such that $T(\bar{F}) = p^m - 1$. Then $F(x)$ is an MP-polynomial when

1. p > 2 and: ^ ag ( mod p^), 

or F(x) = X™ -I- OfcX^ + oq, m > p — 2 . 

2. p = 2 and: (a) m is even, ag = e ( mod 4 ); 

, . . / 6-1-20002 ifdi = e; 

^ ^ 4 ( 2 (e -I- O0O2) if oi = 0 ; 

or (c) $F(x) = x^m + a_k x^k + a_0$, $a_k, a_0 \in \{-e, e\}$, $(m, a_0) \ne (2k, e)$.

9 Pseudorandom Sequences Generated by Polylinear
Recurrences

Here we restrict ourselves to some illustrations of the statement that polylinear
sequences over a Galois ring are a good source of pseudorandom sequences.
More detailed information on this subject is contained in [33,36].

9.1. Coordinate Sequences of MP-Recurrence. Let $u$ be an MP-recurrence
of period $(p^m - 1)p^{n-1}$ with minimal polynomial $G(x)$ of degree $m$
over the ring $R = \mathbb{Z}_{p^n}$. Any item $u(i)$ of the sequence $u$ has the standard $p$-ary
decomposition:

$$u(i) = u_0(i) + u_1(i)p + \ldots + u_{n-1}(i)p^{n-1}; \qquad u_s(i) \in \overline{0, p - 1}.$$

The latter gives us $n$ sequences $u_0, \ldots, u_{n-1}$ over the field $\mathbb{Z}_p$. For sufficiently
large but acceptable values of $m$ and $s$ the sequence $u_s$ is a very good source
of pseudorandom numbers. Of course $u_s$ is an LRS over $\mathbb{Z}_p$. Let $\mathrm{rk}\, u_s$ be the rank,
or linear complexity, of $u_s$: the degree of its minimal polynomial. Apparently we
can consider $u_s$ as an "approximation" of a random sequence only if $\mathrm{rk}\, u_s$ is
large enough. There are some estimations of this parameter for the simplest case
$p = 2$.
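The coordinate sequences and their ranks can be computed directly in a toy case. The sketch below is an illustration under stated assumptions: it takes the sequence over $\mathbb{Z}_4$ with characteristic polynomial $x^2 - x + 1$ (a hypothetical small example, far below the sizes discussed in the text) and measures the rank of each binary coordinate sequence with the Berlekamp-Massey algorithm, which is a standard tool brought in here and not the method of [33].

```python
def lrs_terms(a, modulus, count):
    """First `count` terms of the impulse sequence with characteristic
    polynomial x^m + a[m-1] x^(m-1) + ... + a[0] over Z_modulus."""
    m = len(a)
    u = [0] * (m - 1) + [1]
    while len(u) < count:
        u.append(-sum(c * s for c, s in zip(a, u[-m:])) % modulus)
    return u[:count]

def digit_sequence(u, s, p):
    """Coordinate sequence u_s: the s-th p-ary digit of every term."""
    return [(x // p ** s) % p for x in u]

def linear_complexity_gf2(bits):
    """Berlekamp-Massey over GF(2): length of the shortest binary LFSR
    that generates the given finite bit sequence."""
    n = len(bits)
    c = [1] + [0] * n   # current connection polynomial
    b = [1] + [0] * n   # connection polynomial before the last length change
    length, last = 0, -1
    for i in range(n):
        d = bits[i]
        for j in range(1, length + 1):
            d ^= c[j] & bits[i - j]
        if d:
            saved = c[:]
            shift = i - last
            for j in range(len(b) - shift):
                c[j + shift] ^= b[j]
            if 2 * length <= i:
                length, last, b = i + 1 - length, i, saved
    return length

u = lrs_terms([1, -1], 4, 12)   # two full periods: 0, 1, 1, 0, 3, 3, ...
ranks = [linear_complexity_gf2(digit_sequence(u, s, 2)) for s in range(2)]
print(ranks)
```

In this toy case the lowest coordinate sequence $u_0$ has rank $m = 2$, while the next coordinate sequence $u_1$ already has the larger rank 5, which illustrates why the higher coordinate sequences are the interesting ones.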

For $k, l \in \mathbb{N}$ denote $b(k, 0) = k$, $b(k, l) = 0$ if $k < 2l$, and in other cases

$$b(k, l) = k - 2l + \begin{cases} 1, & \text{if } l \text{ is even or } l = 1, \\ 2, & \text{if } l \text{ is odd, } l \ge 3. \end{cases}$$



Theorem 49 If p = 2 then rku* < X]Lo (^ + 1) ' Et=&(/+i)+i (T)- 

Due to the limits on the length of the text, we point out here only one of the 
earliest and simplest lower estimations of rank. 

Theorem 50 (A. Nechaev, 1982 [33]) If $p = 2$ then the polynomial $G(x)$ can
be chosen such that for $s \in \overline{3, n - 1}$



rkus>(2'* 




2'=-kl) 



m \ 

2fc+i + lj 




(9.1) 



where e G {0, 1} and e = 1 for m < 14 and m = 20. 

We conjecture that e in (9.1) is always equal to 1. The lower estimations of $\mathrm{rk}\, u_s$
for $p = 2$ are also given in [13]; these estimations do not contain the first and the
second summands of (9.1). In [33] one can find more precise rank estimations for
$u_s$ for $p = 2$ as well as for $p > 2$. PC calculations using these estimations
give, for example, that the polynomial $G(x) \in \mathbb{Z}_{p^n}[x]$ can be chosen so that
for $p = 2$

if m = 11, then 3383 < rkus < 5340, 59,703 < rkwy < 128,430; 

if m = 31, then 1.37 • 10^ < rkus < 1.53 • 10^, 6 • 10^° < rkur < 10^^; 

for p = 5 

if m = 11, then 2 • 10® < rkus < 10®, 10^^ < rkwy < 8 • 10^^; 

for p = 11 

if m = 5, then 10® < rkus < 2 • 10^, 



2 • lO^o < rkw7 < 3 • 10“.

9.2. The Distribution of Elements on a Cycle of an MP-Recurrence.

Another important requirement for a pseudorandom sequence is the "uniformity"
of the distribution of elements and $k$-tuples on its sufficiently long
segments. The results on this topic are summarized in [33,36]. We present here
only one of the new results. Let $F(x)$ be an MP-polynomial of degree $m$ over a
Galois ring $R = GR(q^n, p^n)$, and let $u \in L_R(F)$ be an MP-recurrence of period
$T = (q^m - 1)p^{n-1}$. Let $0 \le i_1 < \ldots < i_k < T$ be fixed integers, and
let $a_1, \ldots, a_k$ be fixed elements of $R$. Denote by $\nu$ the number of solutions
$i \in \overline{0, T - 1}$ of the system of equations

$$u(i + i_1) = a_1, \ \ldots, \ u(i + i_k) = a_k.$$



Theorem 51 ([26]) If the system of residues of polynomials ; F'’ G 

R[x] modulo F{x) is linearly independent over the field R, then 



T 

w 






p3(n-l)/2^^ 



More precise results on distribution of elements on cycles of linear recurring 
sequences over can be found in [35] and [33, § 27]. 

9.3. Weight Characteristics of MP-Recurrence over $R = GR(q^2, 4)$. In
[34,40] a full description was given of the possible distributions of
elements on cycles of MP-recurrences not only over $\mathbb{Z}_4$ [33] but over any Galois
ring $R = GR(q^2, 4)$ of characteristic 4.

Let $F(x) \in R[x]$ be a monic polynomial of degree $m$ such that its period
$T = T(F)$ is equal to $\tau = q^m - 1$ (distinguished polynomial) or to $2\tau$
(MP-polynomial). Let $u \in L_R(F)$, $u \ne 0$, and let $N_u(c)$ be the number of solutions
$i \in \overline{0, T - 1}$ of the equation $u(i) = c$ for a given $c \in R$. The description of possible
types $[N_u(c) : c \in R]$ in $L_R(F)$ and their multiplicities is obtained, i.e. the
complete weight enumerator of the code $L_R^{\overline{0, T-1}}(F)$ is described. These results
were based on the presentation of LRS $u$ using the trace function in Galois rings
[47,33,39] and on the theory of quadrics over Galois fields of characteristic 2 (see
[59]). For brevity we give only the description of possible values of $N_u(c)$. Let
$\lambda = [m/2]$ be the integer part of $m/2$ and $\delta_{c,0}$ be the Kronecker delta.
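The counts $N_u(c)$ can be tabulated by brute force in the smallest case $R = \mathbb{Z}_4 = GR(2^2, 4)$, i.e. $q = 2$. The sketch below is illustrative only: the two degree-2 polynomials are hypothetical examples (one of period $\tau = 3$, one of period $2\tau = 6$), the helper name is an assumption, and the loop assumes a reversible recurrence ($a_0$ a unit) so that the impulse state recurs.

```python
from collections import Counter

def cycle_counts(a, modulus):
    """Return [N_u(0), ..., N_u(modulus-1)] over one full period of the
    impulse sequence with characteristic polynomial
    x^m + a[m-1] x^(m-1) + ... + a[0] over Z_modulus.
    Assumes a[0] is a unit, so the sequence is purely periodic."""
    m = len(a)
    start = (0,) * (m - 1) + (1,)
    state, counts = start, Counter()
    while True:
        counts[state[0]] += 1
        nxt = -sum(c * s for c, s in zip(a, state)) % modulus
        state = state[1:] + (nxt,)
        if state == start:
            break
    return [counts[c] for c in range(modulus)]

print(cycle_counts([1, 1], 4))    # x^2 + x + 1, period 3
print(cycle_counts([1, -1], 4))   # x^2 - x + 1, period 6
```

In both examples the element 2 never occurs on the cycle while 0, 1 and 3 occur equally often, so even in this tiny case the distribution is of a very restricted type.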



Theorem 52 If $F(x)$ is a distinguished polynomial then for any $c \in R$

$$N_u(c) = q^{m-2} \pm w q^{\lambda - 1} - \delta_{c,0},$$

where $w \in \{1, q - 1\}$ if $m = 2\lambda + 1$, and $w \in \{0, 1, q - 1\}$ if $m = 2\lambda$. There
exist not more than $2q + 1$ different types $[N_u(c) : c \in R]$ in $L_R(F)$.

In particular, for a nonzero $c$ we have $|N_u(c) - q^{m-2}| \le (q - 1) q^{\lambda - 1}$. Note that
in [31] the trigonometric sums approach gives a rougher estimation, with $q^{m/2}$
on the right-hand side.

Theorem 53 If $F(x)$ is an MP-polynomial then for any $c \in R$

$$N_u(c) = 2q^{m-2} \pm w q^{\lambda - 1} - 2\delta_{c,0},$$

where $w \in \{0, 2, q - 2, q, 2(q - 1)\}$ if $m = 2\lambda + 1$, and $w \in \{0, 1, 2, q - 1, 2(q - 1)\}$
if $m = 2\lambda$. There exist not more than $2q + 1$ different types $[N_u(c) : c \in R]$ in
$L_R(F)$.

Abstract. We present an algorithm to calculate generators for the invariant
field $k(x)^G$ of a linear algebraic group $G$ from the defining equations
of $G$.

This work was motivated by an algorithm of Derksen which allows the 
computation of the invariant ring of a reductive group using ideal theo- 
retic techniques and the Reynolds operator. The method presented here 
does not use the Reynolds operator and hence applies to all linear algebraic
groups. Like Derksen’s algorithm we start with computing the ideal
vanishing on all vectors $(\xi, \zeta)$ for which $\xi$ and $\zeta$ are on the same orbit.
But then we establish a connection of this ideal to the ideal of syzygies
the generators of the field $k(x)$ have over the invariant field. From this
ideal we can calculate the generators of the invariant field exploiting a 
field-ideal-correspondence which has been applied to the decomposition 
of rational mappings before. 



1 Introduction 

Invariant theory has a long tradition as well as new applications. The branch 
of constructive invariant theory is mostly interested in questions like finding 
generators for invariant rings of a given group. Much research has been done for 
the case of finite groups (see e. g. [10,16,14]). Invariant fields for finite groups are 
just the quotient fields of the corresponding invariant ring so problems in that 
area were rather structural like “is a given invariant field rational?” [9] which is 
an instance of the famous rationality problem. 

But in the case of linear algebraic groups generating systems of invariant fields 
become more interesting. Every invariant field has a finite generating system even 
if the corresponding invariant ring does not. 

Recently Derksen showed how to compute generators for invariant rings also 
of infinite reductive groups using ideal theory. This motivated our study of an 
ideal theoretic approach to invariant fields. This paper presents an algorithm to 
compute generators for the invariant field of a linear algebraic group which is 
given by its equations. 

The paper is structured as follows: Section 2 will explain a correspondence
between fields and certain syzygy ideals. This correspondence will later on allow
us to pass from the ideal theoretic part of our algorithm back to function fields.
Section 3 introduces the equivalence relation relating those points which are on
the same group orbit, the so-called graph of the action. The ideal describing the
Zariski closure of this equivalence relation can be computed from the equations
defining the group. In Section 4 we will present two results relating the ideal
describing the graph of the action and the syzygy ideal corresponding to the
invariant field we are looking for. Because the field-ideal-correspondence is
constructive, we will be able to compute the invariant field following this path.
The last section gives four examples which were computed using Maple V.

2 A Field Ideal Correspondence 

To be able to use ideal theoretic methods to deal with function fields we shortly 
restate the field-ideal-correspondence which was used in [13] to find decomposi- 
tions of rational mappings. 

Given fields $k(f) = k(f_1, \ldots, f_r)$ and $k(x) = k(x_1, \ldots, x_n)$ finitely generated
over a field of constants $k$ and both being contained in some field $k(V) =
\mathrm{Quot}(k[X_1, \ldots, X_s]/I(V))$, we formally define the syzygy ideal mentioned in the
introduction.

Definition 1. Let $k(f)$ and $k(x)$ be fields lying over a field $k$ of constants and
let $\{f\}$, $\{x\}$ denote the sets of generators of $k(f)$ and $k(x)$ over $k$ respectively.
Furthermore let $Z = \bigcup_{x \in \{x\}} \{Z_x\}$ be a set of variables and $k(f)[Z]$ be the ring of
polynomials in these variables over the field $k(f)$. Then the ideal $J_{k(x)/k(f)} \trianglelefteq
k(f)[Z]$ of all algebraic relations of the set $\{x\}$ over $k(f)$ is defined as

$$(\{x - Z_x \mid x \in \{x\}\}) \cap k(f)[Z].$$

The ideal $(\{x - Z_x \mid x \in \{x\}\})$ used in the definition can also be viewed as a
syzygy ideal, namely $J_{k(x)/k(x)}$ representing the trivial extension $k(x)/k(x)$.

There is an alternative characterization of this ideal which was given in [12].
It is the key to the properties of the syzygy ideal which are used in the following.

Lemma 1. The ideal $J_{k(x)/k(f)}$ equals the kernel of the specialization homomorphism

$$k(f)[Z_{x_1}, \ldots, Z_{x_n}] \to k(f)(x), \qquad Z_x \mapsto x.$$

In this paper we just need a special case of the ideal $J_{k(x)/k(f)}$, namely we
restrict ourselves to $k(f)$ being a subfield of $k(x)$. As we can think of the field
$k(x)$ and its generating system $\{x\}$ as being fixed we will just write $J_{k(f)}$ instead
of $J_{k(x)/k(f)}$.

In this special case we can express the generators of $k(f)$ in terms of the
$x_1, \ldots, x_r$. An alternative characterization of the ideal $J_{k(f)}$ can then be given
(following [13]). This characterization has computational advantages, and for $k(x)$
being fixed we get a correspondence between intermediate fields of $k(x)/k$ and
the ideals $J_{k(f)}$. For this correspondence we need a result from [13]:

Proposition 1. Let $k(f) \le k(x)$ be fields finitely generated over $k$ and let the
generators $f_1, \ldots, f_m$ of $k(f)$ over $k$ be expressed in $x = x_1, \ldots, x_r$ as
$f_1 = \frac{n_1(x)}{d_1(x)}, \ldots, f_m = \frac{n_m(x)}{d_m(x)}$. Let $Z$ denote $Z_{x_1}, \ldots, Z_{x_r}$; then we define an ideal

$$I = \left( n_1(Z) - \frac{n_1(x)}{d_1(x)}\, d_1(Z), \ \ldots, \ n_m(Z) - \frac{n_m(x)}{d_m(x)}\, d_m(Z) \right)$$

and for $d = d_1(Z) \cdots d_m(Z)$ we get:

$$J_{k(f)} = (J_k + I) : d^{\infty}.$$

Furthermore the coefficients of a reduced Gröbner basis of $J_{k(f)} \cdot k(x)[Z]$ generate
$k(f)$.

In the remainder of the paper there will be no algebraic relations among the
generators $x_1, \ldots, x_r$ of $k(x)$. Therefore $J_k$ will be the zero ideal and we need not
worry about computing $J_k$.

For an ideal $I$ the saturation $I : d^{\infty}$ is effective ([1], Algorithm idealdiv2)
and so is the problem of representing a field element in some generators [17,8,13].
Thus the ideals of Proposition 1 can be obtained through Gröbner basis
computations if the generators for the field are known.
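A small worked instance may help to fix the notation of Proposition 1. It is a hypothetical example chosen for illustration, not one of the examples of this paper: take $k(x) = k(x_1, x_2)$ and the subfield generated by the single element $f = x_1 / x_2$.

```latex
% Hypothetical example: k(x) = k(x_1, x_2), subfield k(f) with f = x_1 / x_2.
% Here n(Z) = Z_1, d(Z) = Z_2, and J_k = 0 since x_1, x_2 are independent.
\begin{align*}
I &= \Bigl( Z_1 - \tfrac{x_1}{x_2}\, Z_2 \Bigr) \subseteq k(f)[Z_1, Z_2], \\
J_{k(f)} &= (J_k + I) : d^{\infty} = I : Z_2^{\infty}
          = \Bigl( Z_1 - \tfrac{x_1}{x_2}\, Z_2 \Bigr).
\end{align*}
% I is generated by a linear form, hence prime, and it does not contain Z_2,
% so the saturation changes nothing.
% Reduced Groebner basis: { Z_1 - (x_1/x_2) Z_2 }; its coefficients 1 and
% -x_1/x_2 generate k(x_1/x_2) = k(f) over k.
```

The coefficients of the single basis element recover the generator $x_1 / x_2$ of the subfield, as the last statement of Proposition 1 predicts.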

We can conclude for the fields $k(x)$, $k(f)$ being finitely generated over a
computable field $k$ and contained in a field $k(V) = \mathrm{Quot}(k[X_1, \ldots, X_s]/I(V))$ that
the ideal $J_{k(f)}$ can be computed effectively.

Thus the following field ideal correspondence is constructive:

Corollary 1. For subfields $k(f)$ of $k(x)$ being finitely generated over a
computable field $k$ and contained in a field $k(V) = \mathrm{Quot}(k[X_1, \ldots, X_s]/I(V))$ the
mapping

$$\mathcal{C} : \{k(f) \mid k \le k(f) \le k(x)\} \to \{I \mid I \trianglelefteq k(x)[Z]\}, \qquad
k(f) \mapsto J_{k(f)} \cdot k(x)[Z]$$

is inclusion preserving and injective, and $\mathcal{C}$ as well as $\mathcal{C}^{-1}$ can be computed effectively.

But the problem in this paper is to calculate field generators. Hence in the
next two sections we will develop a method to compute the ideal $J_{k(x)/k(x)^G} \cdot
k(x)[Z]$ from the ideal $H_G$ of the defining equations of the graph of the action of
a group $G$. With the above corollary we will then see that generators for $k(x)^G$
can be computed effectively from $H_G$.



3 The Graph of the Action of a Linear Algebraic Group 

In this section we give a formal definition of the equivalence relation which
relates two points iff they are on the same $G$-orbit and we restate the fact that
the ideal $H_G$ of the defining equations of the graph of the action can be computed
effectively.

Then we will state a theorem of Derksen which motivated this paper.

Definition 2. For a group $G$ acting on a finite dimensional vector space $V$ we
define the following relation from $V$ to $V$:

$$\{(\xi, \zeta) \mid \exists g \in G : g \cdot \xi = \zeta\}.$$

This relation will be called the graph of the action of the group $G$. The Zariski
closure of this relation will be called the closed graph of the action of the group
$G$.

The ideal of all polynomials over $k$ vanishing on the graph of the action of $G$
will be denoted by $H_G \trianglelefteq k[X, Z]$.

For the first step in our algorithm we have to compute generators for the
ideal $H_G$ from the equations defining the group $G$.

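Lemma 2. Let $(p_1, \ldots, p_s) \trianglelefteq k[m_{11}, m_{12}, \ldots, m_{nn}]$ denote the ideal of those
polynomials over $k$ vanishing on $G \subseteq k^{n \times n}$. Let $I \trianglelefteq k[m_{11}, m_{12}, \ldots, m_{nn}, X, Z]$
denote the ideal generated by $p_1, \ldots, p_s$ and the entries of

$$\begin{pmatrix} m_{11} & \cdots & m_{1n} \\ \vdots & & \vdots \\ m_{n1} & \cdots & m_{nn} \end{pmatrix}
\begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix} -
\begin{pmatrix} Z_1 \\ \vdots \\ Z_n \end{pmatrix}.$$

Then $H_G = I \cap k[X, Z]$. Hence $H_G$ can be computed using elimination.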

Proof. The proof is obvious from the Extension Theorem [2]: consider the points
of the graph of the action of $G$ as all those points which extend to a point on the
variety of $I$. As $H_G$ describes the Zariski closure of all partial solutions which
can be extended to points on the variety of $I$, the ideal $H_G$ also describes the
Zariski closure of the graph of the action of $G$. □
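The statement of Lemma 2 can be spot-checked on a small case. The sketch below uses a hypothetical example that is not one of the examples of this paper: the torus $G = \{\mathrm{diag}(t, t^{-1})\}$ acting on $k^2$, defined inside $k^{2 \times 2}$ by $m_{11} m_{22} - 1$, $m_{12}$, $m_{21}$. Eliminating the $m_{ij}$ by hand from these equations and from $m_{11} X_1 - Z_1$, $m_{22} X_2 - Z_2$ gives the candidate generator $X_1 X_2 - Z_1 Z_2$ of $H_G$. The code performs no elimination; it only verifies with exact rational arithmetic that this candidate vanishes on sampled points of the graph of the action.

```python
from fractions import Fraction
from itertools import product

def act(t, xi):
    """Action of diag(t, 1/t) on a point xi = (xi_1, xi_2)."""
    return (t * xi[0], xi[1] / t)

def h(xi, zeta):
    """Candidate generator X1*X2 - Z1*Z2 of H_G, evaluated at (xi, zeta)."""
    return xi[0] * xi[1] - zeta[0] * zeta[1]

# A few nonzero rational sample values, used for t and for the coordinates.
samples = [Fraction(1, 2), Fraction(-3, 1), Fraction(5, 7)]

# Points (xi, g.xi) of the graph of the action.
graph_points = [
    ((x1, x2), act(t, (x1, x2)))
    for t, x1, x2 in product(samples, repeat=3)
]

vanishes_on_graph = all(h(xi, zeta) == 0 for xi, zeta in graph_points)
off_graph_value = h((Fraction(1), Fraction(1)), (Fraction(1), Fraction(2)))
print(vanishes_on_graph, off_graph_value)
```

The candidate vanishes on every sampled graph point and is nonzero at the pair $((1, 1), (1, 2))$, whose two points lie on different orbits.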

But first we will give an additional motivation for the study of the Zariski
closure of the graph of the action of a group, namely its connection to the
invariant ring of that group. Derksen was able to turn the proof of Hilbert’s finiteness
theorem into an algorithm (see [4]). The algorithm of Derksen makes use of the
ideal defining the graph of the action and thereby establishes the connection
between the two structures.

Next we will briefly state the result of Derksen following the presentation of
Decker and De Jong [3]. Before that we restate the finiteness result of Hilbert:

Theorem 1 (Hilbert).

Let $G$ be a reductive group, let $* : k[X] \to k[X]^G$, $f \mapsto f^*$ denote the Reynolds
operator, and let $I_N$ be the ideal generated in $k[X]$ by the homogeneous invariants
of positive degree in $k[X]^G$. Then for homogeneous polynomials $f_1, \ldots, f_r$ we have

$$(f_1, \ldots, f_r) = I_N \quad \text{iff} \quad k[f_1^*, \ldots, f_r^*] = k[X]^G.$$

Because $k[X]$ is noetherian, the invariant ring is finitely generated.
For reductive groups it suffices to know generators of the ideal $I_N$ and one can
derive a generating system of $k[X]^G$ using the Reynolds operator. The algorithm
of Derksen computes generators of $I_N$ from the ideal defining the graph of the
action.

Theorem 2 (Derksen). Let $H_G \trianglelefteq k[X, Z]$ denote the defining ideal of the closed
graph of the action of $G$; then

$$((Z) + H_G) \cap k[X] = I_N.$$

The proof can be found in [3].
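As a hedged illustration of Theorem 2, consider again a hypothetical example that is not taken from this paper: the torus $G = \{\mathrm{diag}(t, t^{-1})\}$ acting on $k^2$, for which elimination by hand gives $H_G = (X_1 X_2 - Z_1 Z_2)$.

```latex
% Hypothetical example: G = { diag(t, t^{-1}) } acting on k^2.
\begin{align*}
H_G &= (X_1 X_2 - Z_1 Z_2), \\
(Z) + H_G &= (Z_1, Z_2, X_1 X_2), \\
\bigl((Z) + H_G\bigr) \cap k[X] &= (X_1 X_2) = I_N .
\end{align*}
% The generator X_1 X_2 is already invariant, so applying the Reynolds
% operator changes nothing and k[X]^G = k[X_1 X_2].
```

Setting $Z = 0$ in the graph ideal thus recovers the ideal generated by the invariants of positive degree, here the single invariant $X_1 X_2$.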

In the next section we will present a variation of Derksen’s algorithm which 
works for invariant fields of algebraic groups. This algorithm will be independent 
of the Reynolds operator and will do for non-reductive groups as well. 



4 Calculating Generators for Invariant Fields 

In this section we want to establish a connection between the ideal $H_G$ describing
the graph of the action of a group $G$ and the ideal $J_{k(x)^G}$ corresponding to the
invariant field. This connection will then be exploited to calculate invariant fields.

But first we look at a result of Rosenlicht connecting generating systems of
invariant fields and the graph of the action.

Theorem 3 (Rosenlicht). Let $G$ denote a linear algebraic group and let $f =
(f_1, \ldots, f_r)$ be a rational mapping for which $k(f) = k(x)^G$. Then there exists a
$G$-invariant open subset $U$ of $k^n$ such that $f$ is regular on $U$ and the relation
$f^{-1} \circ f \subseteq k^n \times k^n$ restricted to $U \times U$ equals the graph of the action of $G$ restricted
to $U \times U$.

For a proof see [15, Satz 2.2].

As a corollary we get: 

Corollary 2. Using the same notation as above we can conclude:

1. We denote the extension of the ideal $J_{k(x)^G}$ to $k(x)[Z]$ by $J_{k(x)^G} \cdot k(x)[Z]$. Let
$(J_{k(x)^G} \cdot k(x)[Z] \cap k[x, Z])|_{x=X}$ be the ideal of all polynomials in $J_{k(x)^G} \cdot k(x)[Z]$
where the algebraically independent parameters $x_1, \ldots, x_n$ are replaced by
variables $X_1, \ldots, X_n$. Then there exists an open nonempty subset $U$ of $k^n$
such that we get the following equality for the varieties restricted to $U \times U$:

$$\mathbf{V}(H_G)|_{U \times U} = \mathbf{V}\bigl((J_{k(x)^G} \cdot k(x)[Z] \cap k[x, Z])|_{x=X}\bigr)|_{U \times U}$$

2. The ideals $H_G$ and $(J_{k(x)^G} \cdot k(x)[Z] \cap k[x, Z])|_{x=X}$ have associated prime ideals
in common.

3. Let $H_G|_{X=x} \trianglelefteq k(x)[Z]$ denote the ideal generated by the image of $H_G$ under
the specialization homomorphism $k[X, Z] \to k(x)[Z]$ defined by $X_i \mapsto x_i$.
Then the ideal $J_{k(x)^G} \cdot k(x)[Z]$ and the ideal $H_G|_{X=x}$ have associated prime
ideals in common.

Proof. For an algebraic group $G$ the graph of the action is a projection of a
variety. Hence from the Extension Theorem [2, Theorem 4 § 6 Chapter 3] we
can conclude that there exists an open nonempty subset $U_1 \subseteq k^n$ such that the
closed graph of the action $\mathbf{V}(H_G)$ is on $U_1 \times U_1$ equal to the graph of the action of
$G$. Furthermore it is easy to see that there is an open nonempty subset $U_2 \subseteq k^n$
such that $(J_{k(x)^G} \cdot k(x)[Z] \cap k[x, Z])|_{x=X}$ describes on the open set $U_2 \times U_2$ the
fibers of a rational mapping $f = (f_1, \ldots, f_r)$ for which $k(f) = k(x)^G$. From the
theorem of Rosenlicht we can deduce the existence of an open set $U_3$ such that
on $U_3 \times U_3$ the fibers of $(f_1, \ldots, f_r)$ equal the graph of the action. Hence for
$U = U_1 \cap U_2 \cap U_3$ we get the equality stated in point 1 of our corollary.

To show point 2 let $V$ denote the complement of the open set $U \times U$. To
shorten notation we will write $J$ for $(J_{k(x)^G} \cdot k(x)[Z] \cap k[x, Z])|_{x=X}$ and $H$ for $H_G$
in the remaining part of the proof. We get $\mathbf{V}(H) = (\mathbf{V}(H) \cap U \times U) \cup (\mathbf{V}(H) \cap V)$
and $\mathbf{V}(J) = (\mathbf{V}(J) \cap U \times U) \cup (\mathbf{V}(J) \cap V)$. As $\mathbf{V}(J) \cap U \times U = \mathbf{V}(H) \cap U \times U$,
all their associated prime ideals must be equal.

Since $\mathbf{V}(H) \cap U \times U$ and $\mathbf{V}(H) \cap V$ are disjoint and the latter is closed
we can conclude that no associated prime of $\mathbf{I}(\mathbf{V}(H) \cap V)$ is contained in
$\mathbf{I}(\mathbf{V}(H) \cap U \times U)$. Thus all primes associated to $\mathbf{I}(\mathbf{V}(H) \cap U \times U)$ are also
associated to $H$. The same argument applies to $J$ and thus the ideals $H$ and $J$ share
all prime ideals which are associated to $\mathbf{I}(\mathbf{V}(H) \cap U \times U)$.

For the proof of point 3, we first observe that specialization of our variables
$X$ to field elements $x$ is equivalent to localizing our ring $k[X, Z]$ with the
multiplicatively closed set $k[X] \setminus \{0\}$ (and then substituting the symbols $X_1, \ldots, X_n$
by the symbols $x_1, \ldots, x_n$).

Since $\mathcal{V}(H) \cap U \times U$ describes exactly the graph of the action it does not
contain any points $(0, \ldots, 0, \zeta)$ with $\zeta$ not equal to the zero vector. Hence no
associated prime $P$ of $I(\mathcal{V}(H) \cap U \times U)$ contains a polynomial from $k[X] \setminus \{0\}$.
Thus all prime ideals associated to $I(\mathcal{V}(H) \cap U \times U)$ remain associated after
localization (see [18, Theorem 16 (b), Theorem 17 § 10 Chapter IV]). It remains to
show that these associated prime ideals are also associated to the localization of
$H$ and $J$. All those associated prime ideals which contain an element of $k[X] \setminus \{0\}$
become the whole ring when localized. Hence they become redundant and all
other associated prime ideals remain associated prime ideals according to [18,
Theorem 17 § 10 Chapter IV].

□ 

Next we briefly look at the action which the group $G$ induces on the
associated prime ideals of $J_{k(x)^G} \cdot k(x)[Z]$.

Lemma 3. Let the group $G$ operate on $k(x)[Z]$ by operating on the coefficients
from $k(x)$. Then the group $G$ operates transitively on the associated prime ideals
of $J_{k(x)^G} \cdot k(x)[Z]$.

Proof. Let $k(x)^G_{alg}$ denote the algebraic closure of $k(x)^G$ in $k(x)$. Then the field
$k(x)^G_{alg}$ is the unique maximal field lying algebraically over $k(x)^G$ and being
contained in $k(x)$. As the fields $k(x)^G$ and $k(x)$ are closed under the group
action of $G$ the field $k(x)^G_{alg}$ must by definition be closed under the group action
of $G$, too.

The group $G$ operates on $k(x)^G_{alg}$ as a group of automorphisms. Hence $G$
operates like a finite group on $k(x)^G_{alg}$ because the group of automorphisms of
$k(x)^G_{alg}$ leaving $k(x)^G$ invariant must be finite.

From the book of Eisenbud [5, Proposition 13.10’] we know that the group
$G$ then operates transitively on the associated primes of $J_{k(x)^G} \cdot k(x)^G_{alg}[Z]$.
The claim of our lemma can now be concluded by looking at the algorithm
for the computation of primary decompositions given by Gianni, Trager, and
Zacharias [6]. This algorithm never introduces transcendental field extensions
of the ground field (here $k(x)^G_{alg}$) and the primary decomposition of the ideal
$J_{k(x)^G} \cdot k(x)[Z]$ is already found in $k(x)^G_{alg}[Z]$. □

The relation between the ideals $H_G$ and $J_{k(x)^G} \cdot k(x)[Z]$ can now be stated:

Theorem 4. Let $H_G|_{X=x}$ denote the ideal generated by the image of $H_G$ under
the specialization homomorphism $k[X, Z] \to k(x)[Z]$, $X_i \mapsto x_i$ and let
$J_{k(x)^G} \cdot k(x)[Z]$ be the ideal generated by $J_{k(x)^G} \trianglelefteq k(x)^G[Z]$ in $k(x)[Z]$. Then we get the
following equality:

$H_G|_{X=x} = J_{k(x)^G} \cdot k(x)[Z]$

Proof. The ideals $J_{k(x)^G} \cdot k(x)[Z]$ and $H_G|_{X=x}$ are invariant under the group
action of the group $G$ on the coefficients ($\in k(x)$). Furthermore they have (at
least) one associated prime ideal in common. As the group $G$ operates
transitively on the associated primes of $J_{k(x)^G} \cdot k(x)[Z]$ we can conclude that every
prime associated to $J_{k(x)^G} \cdot k(x)[Z]$ is also associated to $H_G|_{X=x}$. Hence we have
$H_G|_{X=x} \subseteq J_{k(x)^G} \cdot k(x)[Z]$ if the two ideals are radical. The ideal $H_G|_{X=x}$ is
radical as it contains all polynomials vanishing on $G \cdot x \times G \cdot x$. The radicality of
$J_{k(x)^G} \cdot k(x)[Z]$ can be deduced from $J_{k(x)^G} \trianglelefteq k(x)^G[Z]$ being prime and the field
extension $k(x)/k(x)^G$ being separable [15, Section VI Lemma 1.5], [19, Corollary
to the Lemma in § 11 Chapter VII].

The inclusion $J_{k(x)^G} \cdot k(x)[Z] \subseteq H_G|_{X=x}$ can be deduced from $H_G|_{X=x}$ being
by construction the maximal ideal vanishing on $G \cdot x \times G \cdot x$ and $J_{k(x)^G} \cdot k(x)[Z]$
also vanishing on $G \cdot x \times G \cdot x$. □

This relation between the ideals $H_G$ and $J_{k(x)^G} \cdot k(x)[Z]$ can be used to
calculate generators for an invariant field $k(x)^G$.

Corollary 3. Let $C$ denote the set of coefficients of a reduced Gröbner basis of
the ideal $H_G|_{X=x}$. Then

$k(C) = k(x)^G$.

The proof is obvious from the above theorem and Proposition 1.

To make the treatment of the ideals complete we give a method to compute
$H_G$ from $J_{k(x)^G} \cdot k(x)[Z]$.

For this we need the following result:

Lemma 4. Let $(J_{k(x)^G} \cdot k(x)[Z] \cap k[x, Z])|_{x=X}$ be the ideal of all polynomials
from $J_{k(x)^G} \cdot k(x)[Z]$ with the algebraically independent parameters $x_1, \ldots, x_n$
replaced by the variables $X_1, \ldots, X_n$. Furthermore let $H_G \trianglelefteq k[X, Z]$ be as above.
Then we get the equality:

$(J_{k(x)^G} \cdot k(x)[Z] \cap k[x, Z])|_{x=X} = H_G$

Proof. Let $q(X, Z)$ be an element of $(J_{k(x)^G} \cdot k(x)[Z] \cap k[x, Z])|_{x=X}$. Then it
vanishes on every point of the graph of the action; hence $q$ is in the ideal $H_G$ of
all polynomials vanishing there.

For the other direction let $q(X, Z)$ be contained in $H_G$. Then $q(x, Z)$ is
contained in $J_{k(x)^G} \cdot k(x)[Z]$ (according to Theorem 4) and as no denominators
occur it is also contained in $J_{k(x)^G} \cdot k(x)[Z] \cap k[x, Z]$. Hence $q(X, Z)$ is an element
of $(J_{k(x)^G} \cdot k(x)[Z] \cap k[x, Z])|_{x=X}$. □

Using this lemma we can show the main ingredient for an algorithm to
compute the ideal $H_G$ from the ideal $J_{k(x)^G} \cdot k(x)[Z]$.

Corollary 4. Let $f_1, \ldots, f_r \in k(x)[Z]$ be a generating system for $J_{k(x)^G} \cdot k(x)[Z]$
and $p_1, \ldots, p_r \in k[x, Z]$ be this generating set after clearing denominators
(multiplying by a suitable polynomial from $k[x]$). Then $H_G$ equals

$\sum_{d \in k[x] \setminus \{0\}} (p_1, \ldots, p_r) : d^\infty$.

Proof. Following Lemma 4 we have to show that $\sum_{d \in k[x] \setminus \{0\}} (p_1, \ldots, p_r) : d^\infty$
equals $(J_{k(x)^G} \cdot k(x)[Z] \cap k[x, Z])$. Let $q$ be an element of $\sum_{d \in k[x] \setminus \{0\}} (p_1, \ldots, p_r) : d^\infty$;
then there exists a polynomial $d \in k[x]$ such that $dq \in (p_1, \ldots, p_r)$, hence $dq \in
(f_1, \ldots, f_r)$, and as $(f_1, \ldots, f_r)$ is defined over $k(x)$ we have $q \in (f_1, \ldots, f_r)$, and
as $q$ is a polynomial we get $q \in (f_1, \ldots, f_r) \cap k[x, Z]$. Conversely let $q$ be an
element of $(J_{k(x)^G} \cdot k(x)[Z] \cap k[x, Z])$. Then there is a representation $q = \sum_{i=1}^{r} a_i f_i$,
and clearing the denominators with a $d \in k[x]$ yields a representation $dq =
\sum_{i=1}^{r} a'_i p_i$. Hence $dq$ is an element of $(p_1, \ldots, p_r)$ and as $\sum_{d \in k[x] \setminus \{0\}} (p_1, \ldots, p_r) : d^\infty$
is saturated with respect to every polynomial in $k[x]$ we have proven the
above claim. □

We need two more lemmata (both taken from [11]) to make the above
corollary effective.

Lemma 5.

1. For a primary ideal $Q$ of a polynomial ring and a polynomial $d$ the ideal
$Q : d^\infty$ equals $Q$ or $(1)$.

2. For an intersection $I_1 \cap \cdots \cap I_t$ of ideals $I_1, \ldots, I_t$ of a polynomial ring and
a polynomial $d$ we have:

$(I_1 \cap \cdots \cap I_t) : d^\infty = I_1 : d^\infty \cap \cdots \cap I_t : d^\infty$.

Proof. 1. If the ideal $Q$ contains a power of $d$ then $Q : d^\infty$ equals $(1)$. Now let
$Q$ be a primary ideal with $d^n \notin Q$ for all $n$; then $Q : d^\infty \neq (1)$. Suppose
$Q : d^\infty \neq Q$; then there exists a $d^l \cdot p \in Q$ with $p \notin Q$, but because of $Q$
being primary there must be a power of $d^l$ which is an element of $Q$. This
contradicts $d^n \notin Q$ for all $n$.

2. This follows from Proposition 10 in Chapter 4 § 4 of [2] and the fact that
there always exists an $l$ large enough such that $I : d^\infty = I : d^l$ (see Exercise 8
in Chapter 4 § 4 of [2]).

□
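As a small illustration of both parts of Lemma 5, consider the following worked example in the univariate ring $k[x]$. The ideals and the polynomials $d$ are chosen here purely for illustration and do not come from the text.

```latex
% Part 1: Q = (x^2) is primary with radical (x).
%   d = x:     d^2 \in Q, so the saturation is the whole ring.
%   d = x + 1: no power of d lies in Q, and gcd(x^2, (x+1)^l) = 1,
%              so (x+1)^l p \in (x^2) forces p \in (x^2).
\[
  (x^2) : x^\infty = (1), \qquad (x^2) : (x+1)^\infty = (x^2).
\]
% Part 2: saturation distributes over the intersection
%   I = (x^2) \cap (x+1) = (x^2 (x+1)).
\[
  \bigl((x^2) \cap (x+1)\bigr) : x^\infty
    = \bigl((x^2) : x^\infty\bigr) \cap \bigl((x+1) : x^\infty\bigr)
    = (1) \cap (x+1) = (x+1).
\]
```

The second display shows the mechanism used later: saturating by $d$ discards exactly those primary components that contain a power of $d$.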

The algorithm can now be derived from Corollary 4 and the following result:

Lemma 6. Let $I \trianglelefteq k[x][Z]$, $I' \trianglelefteq k[x]$ be ideals and $I'$ prime. Then the ideal

$\sum_{d \in k[x] \setminus I'} I : d^\infty$

is effectively computable.

Proof. By choosing a term order fulfilling $x \ll Z$ it is decidable if a given ideal
of $k[x][Z]$ contains an element of $k[x] \setminus I'$. Let $I = Q_1 \cap \cdots \cap Q_t$ be a primary
decomposition and $\mathcal{Q}$ be the set of all those primary components which do not
contain an element of $k[x] \setminus I'$.

We claim that

$\sum_{d \in k[x] \setminus I'} I : d^\infty = \bigcap_{Q \in \mathcal{Q}} Q$.

The ideal $\sum_{d \in k[x] \setminus I'} I : d^\infty$ equals $\sum_{d \in k[x] \setminus I'} (Q_1 \cap \cdots \cap Q_t) : d^\infty$, which is
equal to $\sum_{d \in k[x] \setminus I'} Q_1 : d^\infty \cap \cdots \cap Q_t : d^\infty$.

For every primary component $Q_i$ which is not an element of $\mathcal{Q}$ there exists
a summand where $Q_i$ is replaced by the ideal $(1)$. Let $d_{Q_i}$ denote a polynomial
from $k[x] \setminus I'$ for which $Q_i : d_{Q_i}^\infty = (1)$. Since $I'$ is prime we have $\prod_{Q_i \notin \mathcal{Q}} d_{Q_i} \notin I'$;
hence there exists a summand

$Q_1 : \big(\prod_{Q_i \notin \mathcal{Q}} d_{Q_i}\big)^\infty \cap \cdots \cap Q_t : \big(\prod_{Q_i \notin \mathcal{Q}} d_{Q_i}\big)^\infty$

in $\sum_{d \in k[x] \setminus I'} I : d^\infty$ which contains all other summands. This summand therefore
equals the whole sum. □



5 Four Examples 

We will give four detailed examples in this section. 

Our first example will be the invariant field of a representation of the group
$(\mathbb{C}, +)$.

Example 1. For the group $G_1 = \left\{ \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} \,\middle|\, a \in \mathbb{C} \right\}$, which is isomorphic to $(\mathbb{C}, +)$,
the defining equations are the linear equations $m_{11} = 1$, $m_{21} = 0$, $m_{22} = 1$. From
the ideal $(m_{11}Z_1 + m_{12}Z_2 - X_1,\; m_{21}Z_1 + m_{22}Z_2 - X_2,\; m_{11} - 1,\; m_{21},\; m_{22} - 1)$
describing the points $(m_{11}, m_{12}, m_{21}, m_{22}, Z_1, Z_2, X_1, X_2)$ for which

$\begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix} \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$

we get by elimination of the variables $m_{11}, m_{12}, m_{21}, m_{22}$ the defining
equation $Z_2 - X_2$ for the closed graph of the action. Specializing the variable $X_2$ to
the field element $x_2$ and computing a reduced Gröbner basis yields
$\mathbb{C}(x_1, x_2)^{G_1} = \mathbb{C}(x_2)$.

For our next example we consider the non-compact group $\mathrm{Id}_3 \otimes \mathrm{GL}_2(\mathbb{C})$ where
the group $\mathrm{GL}_2(\mathbb{C})$ operates synchronously on three vectors.

Example 2. The defining equations of the group $\mathrm{GL}_2(\mathbb{C}) \oplus \mathrm{GL}_2(\mathbb{C}) \oplus \mathrm{GL}_2(\mathbb{C})$,
which consists of block matrices, are simply $m_{ij} = 0$ for all those entries $m_{ij}$ which
are not within one of the three $2 \times 2$-blocks. For the group $G_2 = \mathrm{Id}_3 \otimes \mathrm{GL}_2(\mathbb{C})$
we have the additional equations that the following polynomials must equal
zero: $m_{11} - m_{33}$, $m_{12} - m_{34}$, $m_{21} - m_{43}$, $m_{22} - m_{44}$, $m_{11} - m_{55}$, $m_{12} - m_{56}$,
$m_{21} - m_{65}$, $m_{22} - m_{66}$, i.e., the blocks must be equal. Proceeding as in the last example
we get the following ideal describing the closed graph of the action:

{Z2Z5X3 — Z2Z^X^ — Z^Z^Xi 0 Z^ZiX^ 0 ZqX\Z^ 0 ZqX^Zi, 



—Z2XQZ3 0 Z2X4ZS — Z^X2Z^ 0 Z4^Z\Xq 0 Z^Z^X2 — .^e-Z^i^ 4 , 

-X 2 ZsXa 0 X 2 ZaXs 0 X 4 ZSX 1 - X 4 Z 1 XS - XeZaXi 0 XgZiXa, 

Z4XSX2 - ^6^2X3 - Z2X4XS 0 Z2X4XS 0 Z2X3X6 0 XiZ^Xi - Z4X1X6). 

Again specializing the variables Xi to field elements xi and computing the re- 
duced Grobner basis 

, XiX4-X3a;2„ , -X1X6 0 XsX2 „ 

fzii -I Zs -I Z3, 

X3X6 — X4XS X^Xq — X4X5 



-XiXq 0 X5X2 „ , X1X4 - X3X2 „ . 

Z4 -I Zg I 

X3XQ — X4X3 X3X3 — X4X5 



we get the invariant field = C{xi,X2,X3,X4,X5,xef^ 



Our third example has a motivation from physics. In the paper [7] invariants
were applied to show that two quantum states are not locally equivalent, i.e.,
they are not on one orbit of the group $\mathrm{SU}_2(\mathbb{C}) \otimes \mathrm{SU}_2(\mathbb{C})$ of local unitary
transformations. Local in this context refers to the entanglement not being increased
by local transformations.

This leads to the general question how to compute invariants of tensor
product groups. We will here consider the group $\mathrm{SL}_2(\mathbb{C}) \otimes \mathrm{SL}_2(\mathbb{C})$. The group elements
are of tensor product form

$\begin{pmatrix} ae & af & be & bf \\ ag & ah & bg & bh \\ ce & cf & de & df \\ cg & ch & dg & dh \end{pmatrix}$

where the variables $a, b, \ldots, h$ represent elements of $\mathbb{C}$. This structure is directly
reflected in the defining equations.

Example 3. The defining equations for the group $G_3 = \mathrm{SL}_2(\mathbb{C}) \otimes \mathrm{SL}_2(\mathbb{C})$ are
$m_{11} - ae$, $m_{12} - af$, $m_{13} - be$, $m_{14} - bf$, $m_{21} - ag$, $m_{22} - ah$, $m_{23} - bg$,
$m_{24} - bh$, $m_{31} - ce$, $m_{32} - cf$, $m_{33} - de$, $m_{34} - df$, $m_{41} - cg$, $m_{42} - ch$, $m_{43} - dg$, $m_{44} - dh$
for the group consisting of tensor product matrices plus $ad - bc - 1$, $eh - fg - 1$ for
the determinants of the tensor factors being one. Calculating the ideal describing
the closed graph of the action of $G_3$ we get $(Z_1Z_4 - Z_2Z_3 - X_1X_4 + X_3X_2)$.
Specializing the variables $X_i$ to field elements $x_i$, computing a reduced Gröbner
basis, and extracting coefficients yields: $\mathbb{C}(x_1x_4 - x_3x_2) = \mathbb{C}(x_1, x_2, x_3, x_4)^{G_3}$.
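The invariance of $x_1x_4 - x_3x_2$ under tensor product matrices can be spot-checked numerically. The sketch below is only a sanity check on a few integer matrices of determinant one, not the Gröbner basis computation of the example; the helper names and the sample matrices are illustrative assumptions.

```python
from itertools import product

def kron2(A, B):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list.

    For A = [[a, b], [c, d]] and B = [[e, f], [g, h]] the first row is
    (ae, af, be, bf), matching the tensor product form used in the text.
    """
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def act(M, v):
    """Apply a 4x4 matrix to a vector of length 4."""
    return [sum(M[r][c] * v[c] for c in range(4)) for r in range(4)]

def invariant(v):
    """The candidate invariant x1*x4 - x3*x2."""
    x1, x2, x3, x4 = v
    return x1 * x4 - x3 * x2

# A few integer matrices of determinant 1 (hypothetical sample, not from the paper).
SL2_SAMPLES = [
    [[1, 0], [0, 1]],
    [[1, 3], [0, 1]],
    [[2, 1], [1, 1]],
    [[0, -1], [1, 0]],
    [[5, 2], [2, 1]],
]

def check_invariance(vectors):
    """True if the invariant is unchanged for every sampled pair (A, B) and vector."""
    return all(
        invariant(act(kron2(A, B), v)) == invariant(v)
        for A, B in product(SL2_SAMPLES, repeat=2)
        for v in vectors
    )
```

The check works because $(A \otimes B)$ acts on the vector, read as a $2 \times 2$ matrix $X$, by $X \mapsto A X B^{T}$, and the invariant is $\det X$.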

An interesting example for calculating the ideal $H_G$ from generators of the
ideal $J_{k(x)^G}$ comes from the cyclic group $G$ generated by $\omega \cdot \mathrm{Id}_n$ where $\omega$ denotes
a primitive $l$-th root of unity. The invariant field $k(x)^G$ can be generated by
$n$ rational functions alone: $k(x)^G = k(x_1^l, x_2/x_1, \ldots, x_n/x_1)$. But to generate the
invariant ring one needs all (exponentially many) monomials of degree $l$.
The ring $k[x]^G$ proves that the Noether bound on the number of ring generators
one needs is tight [16, Proposition 2.1.5]. This change from few field generators
to many ring generators can be observed when passing from $J_{k(x)^G}$ to $H_G$.

Example 4. For the group $G = \langle i \cdot \mathrm{Id}_3 \rangle$ with the invariant field
$k(x)^G = k(x_1^4, x_2/x_1, x_3/x_1)$ the
ideal $J_{k(x)^G} \cdot \mathbb{C}(x_1, x_2, x_3)[Z_1, Z_2, Z_3]$ is generated by $Z_1^4 - x_1^4$, $x_1Z_2 - x_2Z_1$,
$x_1Z_3 - x_3Z_1$ (it is already saturated with respect to $x_2x_3$).

Substituting the field elements $x_1, x_2, x_3$ by variables $X_1, X_2, X_3$ and
computing a saturation with respect to all $d \in k[X_1, X_2, X_3]$ yields the generating
system -Xf+Zf, -XfX2 + Z^Z2, -X3Xf + ZfZs, -X^Xf + Z^Z^ -AgAaA^-h 
Z3Z2ZI -XiXf + ZiZl -A|Ai + ZiZi, -A3AIA1 + ZIZ3Z1, -A1A2AI + 
Z2ZIZ1, -A|Ai + ZiZi, -X1Z2 + X2Z1, X3Z1 - X1Z3, -Xi + Zi, -X3XI + 
ZIZ3, -A|A| + ZiZi, -XIX2 + ZIZ2, X3Z2 - X2Z3, -A| + Zl 

Following Derksen’s algorithm we set the variables $Z_1, Z_2, Z_3$ to zero and
obtain all monomials of degree four. This already is a generating system for the
invariant ring.
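The count of ring generators in this example can be reproduced directly. The sketch below enumerates exponent vectors and uses the fact that, under $x \mapsto i \cdot x$, a monomial of total degree $d$ is multiplied by $i^d$; the function names are illustrative assumptions.

```python
from itertools import product
from math import comb

def monomials_of_degree(n_vars, degree):
    """Exponent vectors of all monomials of the given total degree."""
    return [e for e in product(range(degree + 1), repeat=n_vars)
            if sum(e) == degree]

def is_invariant(exponents, order=4):
    """A monomial x^e is scaled by i^(sum e) under x -> i*x,
    so it is invariant exactly when sum(e) is divisible by 4."""
    return sum(exponents) % order == 0

# All monomials of degree four in three variables.
deg4 = monomials_of_degree(3, 4)

# Invariant monomials of degree one, two, or three (there are none).
low_degree_invariants = [e for d in (1, 2, 3)
                         for e in monomials_of_degree(3, d)
                         if is_invariant(e)]
```

Here `deg4` has 15 entries, against only three generators for the invariant field; this is the contrast between field and ring generators described above.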

Acknowledgements: We thank Rainer Steinwandt for [18, 19].
Abstract. An efficient algorithm is presented which for any finite field
$\mathbb{F}_q$ of small characteristic finds an extension $\mathbb{F}_{q^s}$ of polynomially bounded
degree and an element $\alpha \in \mathbb{F}_{q^s}$ of exponentially large multiplicative
order. The construction makes use of certain analogues of Gauss periods
of a special type. This can be considered as another step towards solving
the celebrated problem of finding primitive roots in finite fields efficiently.



1 Introduction 

One of the most important unsolved and notoriously hard problems in the
computational theory of finite fields is to design a fast algorithm to construct
primitive roots in a finite field $\mathbb{F}_q$ of $q$ elements. It is demonstrated in [12] that one can
find a primitive root of $\mathbb{F}_q$ in exponential time $q^{1/4+\varepsilon}$ for any $\varepsilon > 0$ and so far
this is the best known result. Even the Extended Riemann Hypothesis (ERH)
does not imply any essentially better results, see Chapters 2 and 3 of [13].

On the other hand, for many applications instead of a primitive root just 
an element of high multiplicative order is sufficient. Such applications include 
but are not limited to cryptography, coding theory, pseudo random number 
generation and combinatorial designs. 

Following this idea, the authors [9] have proposed some algorithms to
construct elements of exponentially large order for some sufficiently ‘dense’ sequence
of extensions of a prime field $\mathbb{F}_p$ of small characteristic. The algorithms are based
on studying multiplicative orders of Gauss periods in finite fields. As an
additional bonus, in many cases the corresponding large order elements are generators
of normal bases as well.

For several applications, however, it is desirable to provide a large
order element either in some given field $\mathbb{F}_q$ or at least in a not too large extension
$\mathbb{F}_{q^s}$ of it.

This work as well as its predecessor [9] provides yet another contribution
to the theory of Gauss periods over finite fields and their generalizations and
analogues, which have already proved to be useful for a number of
applications [1, 2, 3, 4, 5, 6, 7, 8, 9]. Apparently Gauss periods are worth a further study
of their properties and areas of possible applications. In particular, it is
interesting to remark that numerous experimental results (which can be found in the
above papers) indicate that Gauss periods often produce primitive roots and
thus generate primitive normal bases.

Here we concentrate on the special case of certain analogues of Gauss periods
in $\mathbb{F}_{q^n}$ of type $(n, 2)$.

We let $r$ be an integer relatively prime to $q$, and $n = \varphi(r)$, the Euler function.
Let $\beta$ be a primitive $r$th root of unity; this is an element of $\mathbb{F}_{q^n}$. Then we consider
the element

$\alpha = \beta + \beta^{-1} \in \mathbb{F}_{q^n}$. (1)

We recall that if $r$ is prime, then $\alpha$ is called a Gauss period of type $(n, 2)$ over
$\mathbb{F}_q$. However, for composite $r$ the Gauss periods of type $(n, 2)$ are defined in a
different way, see [2].

It is well-known that the minimal polynomial of $\beta$ over $\mathbb{F}_q$ is of degree $t$,
where $t$ is the multiplicative order of $q$ modulo $r$. Thus $t \mid \varphi(r) = n$.
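The two quantities in this paragraph, the Euler function $\varphi(r)$ and the multiplicative order $t$ of $q$ modulo $r$, are easy to compute for small parameters. The following is a minimal sketch with naive algorithms; the function names are illustrative assumptions and the code is only suitable for small $r$.

```python
from math import gcd

def euler_phi(r):
    """Euler's totient function, by trial division."""
    result, m, p = r, r, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:  # one prime factor larger than sqrt(r) may remain
        result -= result // m
    return result

def multiplicative_order(q, r):
    """Smallest t >= 1 with q**t == 1 (mod r); requires r >= 2 and gcd(q, r) == 1."""
    if r < 2 or gcd(q, r) != 1:
        raise ValueError("need r >= 2 and gcd(q, r) = 1")
    t, power = 1, q % r
    while power != 1:
        power = power * q % r
        t += 1
    return t
```

For example, with $q = 2$ and $r = 7$ one gets $t = 3$ and $\varphi(7) = 6$, so the 7th cyclotomic polynomial splits over $\mathbb{F}_2$ into two factors of degree 3, while for $r = 11$ one gets $t = 10 = \varphi(11)$.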

If we know the minimal polynomial of $\beta$ over $\mathbb{F}_q$ (that is, we have factored
$x^r - 1$ over $\mathbb{F}_q$), then that of $\alpha$ can be determined by linear algebra over $\mathbb{F}_q$
in polynomial time. While a deterministic polynomial time algorithm to factor
polynomials over finite fields is not known, there are many efficient (probabilistic
and deterministic, unconditional and ERH-dependent) methods. We just
mention that $x^r - 1$ can be completely factored over the field $\mathbb{F}_q$ of characteristic $p$
with the following number of arithmetic operations in $\mathbb{F}_q$:

o $(r \log q)^{O(1)}$ probabilistically,
o $p^{1/2}(r \log q)^{O(1)}$ deterministically,
o $(r \log q)^{O(1)}$ deterministically under the ERH.

More precise forms of these assertions and further details can be found in [13],
Section 1.1.

In this paper, as in [9], we do not estimate the cost of constructing $\alpha$ by
using (1) (which, as we mentioned above, may depend on the selected
computational model or some unproved number theoretic conjecture, but is fairly small
anyway). Our result is an explicit formula for parameters of the construction (1)
such that the corresponding $\alpha$ is of large period.
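To make the construction (1) concrete, the sketch below builds $\alpha = \beta + \beta^{-1}$ for the small case $q = 2$, $r = 11$ and computes its multiplicative order by brute force. The choice of parameters, the bitmask representation of $\mathbb{F}_2[x]$, and all function names are assumptions made for illustration; the code relies on the 11th cyclotomic polynomial being irreducible over $\mathbb{F}_2$, which holds because 2 has order 10 modulo 11.

```python
def gf2_mulmod(a, b, mod, deg):
    """Multiply two polynomials over GF(2), stored as bitmasks, modulo `mod`.

    `mod` must have degree `deg`, and `a` must already be reduced (a < 2**deg).
    """
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> deg) & 1:  # degree reached `deg`: subtract the modulus
            a ^= mod
    return result

def gf2_pow(a, e, mod, deg):
    """Square-and-multiply exponentiation in GF(2)[x]/(mod)."""
    result = 1
    while e:
        if e & 1:
            result = gf2_mulmod(result, a, mod, deg)
        a = gf2_mulmod(a, a, mod, deg)
        e >>= 1
    return result

def multiplicative_order(a, mod, deg):
    """Order of a nonzero element, by stepping through its powers."""
    order, power = 1, a
    while power != 1:
        power = gf2_mulmod(power, a, mod, deg)
        order += 1
    return order

r = 11
deg = r - 1                 # degree of the 11th cyclotomic polynomial
mod = (1 << r) - 1          # x^10 + x^9 + ... + x + 1 as a bitmask
beta = 0b10                 # the class of x, a primitive 11th root of unity
beta_inv = gf2_pow(beta, r - 1, mod, deg)
alpha = beta ^ beta_inv     # addition in characteristic 2 is XOR
alpha_order = multiplicative_order(alpha, mod, deg)
```

In this instance $\alpha$ lies in the subfield $\mathbb{F}_{32}$, since $2^5 \equiv -1 \pmod{11}$, and it is neither 0 nor 1, so its order is the prime 31: the Gauss period of type $(5, 2)$ over $\mathbb{F}_2$ is a primitive element of $\mathbb{F}_{32}$.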

As in the previous paper [9], our technique is based on some results about
the distribution of exponential functions in residue classes modulo an integer.
Namely, we make use of some results of Korobov [10] and [11] which provide very
precise information about the period and the distribution of exponential
functions modulo an integer divisible by a high power of a small prime.

2 Main Result 

The following theorem describes how, for a prime power $q = p^k$, one can select
the degree $s$ of an appropriate extension of $\mathbb{F}_q$ which is polynomially bounded in
$\log q$ and such that the construction (1) for certain values of parameters produces
an element $\alpha \in \mathbb{F}_{q^s}$ of exponentially large order.

Theorem 1. Let $q = p^k$ be the $k$th power of a prime number $p$ with $\gcd(p, k) = 1$
and let $\Delta > 4$ be a constant. We define

$l = 5$ if $p = 3$, and $l = 3$ if $p \neq 3$,

$m = \lceil \Delta \log k / \log l \rceil$, $r = k^2 l^m$, $n = \varphi(r)$, $s = n/k$.

Then $s \le 4k^{\Delta+1}$ is an integer and the element

$\alpha = \beta + \beta^{-1} \in \mathbb{F}_{p^n} = \mathbb{F}_{q^s}$,

where $\beta$ is a primitive $r$th root of unity over $\mathbb{F}_p$, has multiplicative order at least

$2^{c k^{\Delta/2 - 1} - 10k - 1}$

where 

23/2/5, z/p=3, 

2 -i/ 2 (p _ i)-i^ //p=lmod3, 

23/2(p2 _ z/p=2mod3. 
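The parameter selection of Theorem 1 can be tabulated for small inputs. The sketch below assumes the parameters read $l = 5$ for $p = 3$ and $l = 3$ otherwise, $m = \lceil \Delta \log k / \log l \rceil$ and $r = k^2 l^m$, which is what the integrality identity $n = \varphi(r) = k\varphi(k l^m)$ in the proof requires. It further assumes an integer $\Delta$ so that $m$ can be found exactly as the smallest exponent with $l^m \ge k^\Delta$; the function names are illustrative.

```python
from math import gcd

def euler_phi(r):
    """Euler's totient function, by trial division."""
    result, m, p = r, r, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result

def theorem_parameters(p, k, delta):
    """Parameters (l, m, r, n, s) for q = p**k under the assumptions above.

    `delta` is taken to be an integer here so that the ceiling can be
    computed with exact integer arithmetic.
    """
    if gcd(p, k) != 1:
        raise ValueError("the theorem assumes gcd(p, k) = 1")
    l = 5 if p == 3 else 3
    m = 0
    while l ** m < k ** delta:   # smallest m with l**m >= k**delta
        m += 1
    r = k * k * l ** m
    n = euler_phi(r)
    s, remainder = divmod(n, k)
    assert remainder == 0        # integrality of s
    return {"l": l, "m": m, "r": r, "n": n, "s": s}
```

For $p = 2$, $k = 3$, $\Delta = 5$ this gives $r = 3^7 = 2187$, $n = 1458$ and $s = 486$, well below the bound $4k^{\Delta+1} = 2916$.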

Proof. The integrality of $s$ follows from the identity

$n = \varphi(r) = \varphi(k^2 l^m) = k\varphi(k l^m)$.

We also have

$s = \varphi(k l^m) \le (l - 1) k l^{m-1} \le 4k l^{m-1} \le 4k^{\Delta+1}$.

To derive the claimed lower bound on the order of $\alpha$, we denote by $r_0$ the
product of all prime divisors of $r$. Obviously $r_0 \le lk \le 5k$.

Let $t$ and $t_0$ be the multiplicative orders of $p$ modulo $r$ and $r_0$, respectively,
so that $t \le r$ and $t_0 \le 5k$. Also, let $\tau$ and $\tau_0$ be the multiplicative orders of
$p$ modulo $l^m$ and $l$, respectively. We define $\gamma$ as the largest power of $l$ which
divides $p^{\tau_0} - 1$. Thus

$\gamma \le p^{\tau_0} - 1$.

Korobov [10], Remark after Lemma 1, shows that

$\tau \ge \tau_0 l^m \gamma^{-1} \ge \dfrac{\tau_0}{p^{\tau_0} - 1}\, l^m$, (2)

see also [11], beginning of Section 1. If $p \equiv 1 \bmod 3$, then $\tau_0 = 1$, and

$\dfrac{\tau_0}{p^{\tau_0} - 1} = \dfrac{1}{p - 1}$.

If $p \equiv 2 \bmod 3$, then $\tau_0 = 2$, and

$\dfrac{\tau_0}{p^{\tau_0} - 1} = \dfrac{2}{p^2 - 1}$.

If $p = 3$, then $\tau_0 = 4$, $\gamma = l$, and from (2) we conclude that

$\tau \ge \dfrac{4}{5}\, l^m$.

In all cases, we have

$t \ge \tau \ge 2^{1/2} c\, l^m = 2^{1/2} c\, r k^{-2}$. (3)

We also have $t_0 \le r_0 \le 5k$.

We define $h = \lfloor (r/2)^{1/2} - t_0 r/t \rfloor$, and consider the set

$S = \{ i : 0 \le i < t \text{ and } 1 \le (p^i \text{ rem } r) \le h \}$,

where $(p^i \text{ rem } r) \in \mathbb{N}$ is the positive remainder of $p^i$ on division by $r$. We note
that $\gcd(p, r) = 1$. Korobov [10], Lemma 2, implies that

$|\#S - th/r| \le t_0$, (4)

see also [11], Fact 2. Thus we have $\#S \le th/r + t_0 \le t(2r)^{-1/2}$, and

$2h \cdot \#S \le 2\big((r/2)^{1/2} - t_0 r/t\big) \cdot t(2r)^{-1/2} < t$.

Now we recall that the degree of the minimal polynomial of a primitive $r$th
root of unity $\beta$ over $\mathbb{F}_p$ equals $t$. Let $U, V \subseteq S$ be two different subsets, and

$u = \sum_{i \in U} p^i, \qquad v = \sum_{j \in V} p^j$.

Assume that $\alpha^u = \alpha^v$. Then

$0 = \alpha^u - \alpha^v = \prod_{i \in U} (\beta + \beta^{-1})^{p^i} - \prod_{j \in V} (\beta + \beta^{-1})^{p^j}$

$= \beta^{-u} \prod_{i \in U} (\beta^{2p^i} + 1) - \beta^{-v} \prod_{j \in V} (\beta^{2p^j} + 1)$.

We define the sets

$E = \{ p^i \text{ rem } r : i \in U \}, \qquad F = \{ p^j \text{ rem } r : j \in V \}$,

and the numbers

$A = \sum_{e \in E} e, \qquad B = \sum_{f \in F} f$.

Since $i < t = \mathrm{ord}_r\, p$ for each $i \in S$, we have $p^i \not\equiv p^j \bmod r$ for $i \neq j$ in $S$. In
particular, $E \neq F$. If, say, $A \le B$, then we find that $\beta$ is a root of the polynomial

$Q(x) = x^{B-A} \prod_{e \in E} (x^{2e} + 1) - \prod_{f \in F} (x^{2f} + 1) \in \mathbb{F}_p[x]$.

We show that $Q(x)$ is a nonzero polynomial. Indeed, if $A < B$, this is obvious.
Otherwise, we have $A = B$ and the monomial $x^{2g}$ occurs in $Q$ with nonzero
coefficient, where $g$ is the smallest element which belongs to one and only one of the
sets $E$ and $F$. Furthermore, $\deg Q \le 2h \cdot \#S < t$; thus our assumption $\alpha^u = \alpha^v$
is false.

Therefore, we have at least $2^{\#S}$ distinct powers of $\alpha$. From (3) and (4) we
see that

$\#S \ge th/r - t_0 \ge t(2r)^{-1/2} - 2t_0 - t/r$

$\ge c\, r^{1/2} k^{-2} - 2t_0 - t/r = c\, l^{m/2} k^{-1} - 2t_0 - t/r$

$\ge c\, l^{m/2} k^{-1} - 10k - 1 \ge c\, k^{\Delta/2 - 1} - 10k - 1$,

and we obtain the desired statement. □

In particular, the element $\alpha$ described in Theorem 1 is of order at least

$2^{(c + o(1)) k^{\Delta/2 - 1}}$

where “$o(1)$” tends to zero as $k$ goes to infinity.

3 Remarks 

We give several examples of the parameter selection. Since c > the 

choice 

A = i + l±A2Si£ 

logs k 

yields an element of order at least 2®^“^ in an extension of degree at most 
128^"^^:®. Selecting Z\ = 6 we obtain an element of order at least 

22^''^(fe/p)^-10fc-l 

in an extension of degree at most 4fc^. These estimates are to be compared to 
the order — 1 of a primitive element in F^s . 

A natural way to phrase our question is as follows. Given q = and t, 
compute an element of order at least t in an extension of whose degree we 
want to choose as small as possible. For this task, determining A from the 
equation 

2/y^=/fc2(10fc+l + log2 ^)^ 

(so that Z\ > 4) and applying our construction, we obtain the upper bound 

s < 2p‘^k^{l0k + 1 + log 2 

on the degree s of the corresponding extension; this is polynomial in k and log t. 
We finish by posing several open problems. 

Question 1. Obtain lower bounds on multiplicative orders of Gauss periods of
type $(n, 2)$ with composite $r = 2n + 1$, see [2] (rather than for their analogues (1)).

Question 2. Obtain lower bounds on multiplicative orders of Gauss periods and
their analogues of type $(n, k)$ for $k > 2$, which can be defined similarly, see
[1, 2, 3, 4, 5, 6, 7, 8].

Question 3. Find a fast explicit construction of elements of exponentially large 
multiplicative order which is applicable to all finite fields. 

The last question is probably quite difficult in its full generality, but the 
results of [9] and of this work indicate that it may be feasible for large fields of 
small characteristic. 
Abstract. In this paper, we consider a recent technique of Levenshtein
[9] which was introduced to prove improved lower bounds on aperiodic
correlation of sequence families over the complex roots of unity.

We first give a new proof of the Welch [12] lower bound on the periodic
correlation of complex roots of unity sequences using a modification of
Levenshtein’s technique. This improves the Welch lower bound in the
case that the family size is “large enough” compared to sequence length.
Our main result is an improved lower bound on the periodic correlation of
complex valued sequences with QAM-type alphabets. This result improves an earlier
result by Boztaş [2]. To achieve this result, we extend the Levenshtein
technique in a new direction.

Here, we assume that sequences are drawn from energy “shells” which 
are no more than PSK signal sets with the scaling determining the signal 
energy. It turns out that, if the weights are associated with the energy 
of the sequence in question, the Levenshtein technique can be modified 
to obtain lower bounds. 

The new bound is a non-convex function of a set of suitably chosen 
“weights.” We demonstrate that our results improve those of [2] with a 
concrete example. 



1 Introduction 

Let $E_m^n$ be the set of vectors of length $n$ over the alphabet

$E_m = \{ e^{2\pi i j/m} : j = 0, 1, \ldots, m - 1 \}$.

Throughout this paper $m$ is an arbitrary positive integer satisfying $m \ge 2$. For
any $x = (x_0, x_1, \ldots, x_{n-1}) \in E_m^n$ and $y = (y_0, y_1, \ldots, y_{n-1}) \in E_m^n$ the periodic
crosscorrelation function $\theta(x, y; l)$ is defined as follows:

$\theta(x, y; l) = \sum_{j=0}^{n-1} x_j \overline{y_{j+l}}, \quad l = 0, 1, \ldots, n - 1$ (1)

* Supported by an RMIT Faculty of Applied Science Grant. 
where $\bar{z}$ denotes the complex conjugate of $z$, and the addition in the subscript of
$y_{j+l}$ is modulo $n$. Every code $C \subseteq E_m^n$ is characterized by the maximum periodic
crosscorrelation, maximum periodic nontrivial autocorrelation and maximum
periodic correlation magnitudes respectively:

$\theta_c(C) = \max\{ |\theta(x, y; l)| : x, y \in C,\ x \neq y,\ l = 0, 1, \ldots, n - 1 \}$,

$\theta_a(C) = \max\{ |\theta(x, x; l)| : x \in C,\ l = 1, \ldots, n - 1 \}$,

$\theta(C) = \max\{ \theta_a(C), \theta_c(C) \}$.

The first lower bound on $\theta(C)$ is due to Welch [12] and applies to arbitrary
complex sequences with constant energy, and hence to $C \subseteq E_m^n$. In [12] a lower
bound on $\theta_a(C)$ for such sequences was also derived. Another important lower
bound on $\theta(C)$ for $C \subseteq E_m^n$ is that of Sidelnikov [11].
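The definitions of the periodic correlation function and of the maximum correlation magnitude translate directly into code. The following is a minimal sketch for small examples; the function names are illustrative assumptions, and sequences are given as plain lists of complex (or real) numbers.

```python
def theta(x, y, l):
    """Periodic crosscorrelation: sum over j of x_j * conj(y_{j+l}), index mod n."""
    n = len(x)
    return sum(x[j] * complex(y[(j + l) % n]).conjugate() for j in range(n))

def theta_max(code):
    """Maximum of the nontrivial autocorrelation and crosscorrelation magnitudes.

    All shifts are used for pairs of distinct codewords; the zero shift is
    skipped when a codeword is correlated with itself.
    """
    n = len(code[0])
    best = 0.0
    for a, x in enumerate(code):
        for b, y in enumerate(code):
            for l in range(n):
                if a == b and l == 0:
                    continue
                best = max(best, abs(theta(x, y, l)))
    return best
```

For the binary sequence `[1, 1, -1]` of length 3, `theta(x, x, 0)` equals the energy 3 while both nontrivial shifts give $-1$, so `theta_max([[1, 1, -1]])` is 1.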

The class of sequence families which are subsets of $E_m^n$ for some $m$ is by far
the best understood from the point of view of sequence design. For asynchronous
CDMA applications, the case of interest is usually $M \ge n^2$, where $M$ is the
total number of sequences, which allows $\ge n$ simultaneous users. For $m = 2$
and $n = 2^k - 1$, $k \ge 3$, and $k$ odd, the sequence family due to Gold [5] with
$M = n(n + 2)$ is optimal with $\theta(C) = 1 + \sqrt{2(n + 1)}$. For $m = 4$ and $n = 2^k - 1$,
$k \ge 3$, $k$ odd, the so-called Family A with $M = n(n + 2)$ (see [1] and the
references therein, as well as [4] for its connections to $\mathbb{Z}_4$-linearity) is optimal
with $\theta(C) = 1 + \sqrt{n + 1}$. For $m = p$, an odd prime, the family due to Kumar and
Moreno with $M = n(n + 1)$ is optimal [8] with $n = p^k - 1$, and $\theta(C) = 1 + \sqrt{n + 1}$.

This concludes our brief survey of important sequence families in $E_m^n$ from the
asynchronous CDMA point of view. See [6], [7] and [10] for more details.

Recently, Levenshtein [9] introduced a new method which improves the aperiodic
Welch bound from [12] in the case of sequences in $E_2^n$. We have subsequently
shown [3] that Levenshtein’s method works for sequences in $E_m^n$, for any $m \ge 2$.
To address the aperiodic case, Levenshtein introduced “weights” for shifts of
“zero-padded” code words. A suitable choice of weights allowed him to
substantially improve Welch’s bound for most cases of interest.

In the next section, we demonstrate that the original Welch lower bound on
periodic correlation can be obtained by Levenshtein’s method when the weights
are uniform. We then modify Levenshtein’s method so that it can be applied
to the periodic correlation of sequences over more general alphabets which can be
thought of as the disjoint union of scalings of $E_m^n$. This gives us the freedom
to obtain QAM type alphabets by modifying the scalings of $E_m^n$. This will
become clearer in the sequel. We make extensive use of the machinery introduced
in [9] and believe that this generalization would not be possible without this
machinery.

2 Deriving the Welch Lower Bound on 

We will derive the Welch lower bound by considering a code $C \subseteq E_m^n$ to be made
up of all the codewords and all their distinct cyclic shifts. Sometimes, it will be
convenient to associate an ‘exponent’ vector $x \in \mathbb{Z}_m^n$ with the corresponding
vector $\Phi(x) \in E_m^n$ given by $\Phi(x) = (\Phi(x_1), \ldots, \Phi(x_n))$, $\Phi : \mathbb{Z}_m \to E_m$, $\Phi(u) = e^{2\pi i u/m}$.

For any two subsets $A$ and $B$ of $E_m^n$ define the value

$F(A, B) = \dfrac{1}{|A||B|} \sum_{x \in A} \sum_{y \in B} |\langle x, y \rangle|^2$, (2)

where $\langle u, v \rangle$ denotes the complex inner product of vectors $u$ and $v$. The desired
results for a code $C \subseteq E_m^n$ will be obtained with the help of finding lower and
upper bounds on $F(C, C)$.

Lemma 1. For any code $C \subseteq E_m^n$ of size $M$,

$\theta^2(C) \ge \dfrac{M F(C, C) - n^2}{M - 1}$. (3)

Proof. Considering separately in (2) the cases $x = y$ (in this case $|\langle x, y \rangle| = n$)
and $x \neq y$ we get

$M^2 F(C, C) \le M n^2 + M(M - 1)\, \theta^2(C)$,

which gives the result required.

We now prove a lemma which is going to be crucial in the rest of this paper.

Lemma 2. Let $U$ be the unit circle in the set of complex numbers. For any
$x \in U^n$,

$F(\{x\}, E_m^n) = n$.

Proof. Using (2) we have

$F(\{x\}, E_m^n) = \dfrac{1}{m^n} \sum_{y \in E_m^n} |\langle x, y \rangle|^2 = \dfrac{1}{m^n} E_{n,m} = n$

provided we can show that

$E_{L,m} := \sum_{y \in E_m^L} |\langle x, y \rangle|^2 = L m^L$

for any $x \in U^L$. Note that if we define $\Phi(v) = (\Phi(v_1), \ldots, \Phi(v_L)) = (x_1, \ldots, x_L)$
and $\Phi(z) = (\Phi(z_1), \ldots, \Phi(z_L)) = (y_1, \ldots, y_L)$, then we can write

$E_{L,m} = \sum_{\Phi(z) \in E_m^L} |\langle \Phi(v), \Phi(z) \rangle|^2$, with $\Phi(v) \in U^L$ arbitrary.

Here, $z_k \in \mathbb{Z}_m$ for $k = 1, \ldots, L$, while $v_k \in [0, m)$ for $k = 1, \ldots, L$. We also define
$s_k = v_k/m$, which allows us to write $\Phi(v_k) = e^{2\pi i s_k}$. The sum turns out to
be independent of $\Phi(v) \in U^L$, and is evaluated below.

$E_{L,m} = \sum_{\Phi(z) \in E_m^L} \langle \Phi(v), \Phi(z) \rangle\, \overline{\langle \Phi(v), \Phi(z) \rangle}
= \sum_{\Phi(z) \in E_m^L} \Big( L + \sum_{1 \le k \neq l \le L} e^{2\pi i (s_k - s_l)}\, e^{2\pi i (z_l - z_k)/m} \Big)$;

separating the two sums and interchanging the order of summation in the second
sum now gives

$E_{L,m} = L m^L + \sum_{1 \le k \neq l \le L} e^{2\pi i (s_k - s_l)} \sum_{\Phi(z) \in E_m^L} e^{2\pi i (z_l - z_k)/m}$

and the inner sum can easily be seen to be zero, which completes the
proof.
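The statement of Lemma 2, that the mean of $|\langle x, y \rangle|^2$ over all $y \in E_m^n$ equals $n$ for every $x$ on the unit circle, can be checked by brute force for small $m$ and $n$. The sketch below enumerates all $m^n$ vectors; the function names and the test vector are illustrative assumptions.

```python
import cmath
from itertools import product

def alphabet(m):
    """The m complex m-th roots of unity."""
    return [cmath.exp(2j * cmath.pi * j / m) for j in range(m)]

def mean_squared_inner_product(x, m):
    """Average of |<x, y>|^2 over all y with entries in the m-th roots of unity."""
    n = len(x)
    total = 0.0
    for y in product(alphabet(m), repeat=n):
        inner = sum(xj * yj.conjugate() for xj, yj in zip(x, y))
        total += abs(inner) ** 2
    return total / m ** n

# An arbitrary vector on the unit circle (angles chosen for illustration only).
x = [cmath.exp(1j * a) for a in (0.3, 1.7, 2.9)]
values = [mean_squared_inner_product(x, m) for m in (2, 3, 4)]
```

Up to floating point error every entry of `values` equals $n = 3$, independent of both $x$ and $m$; this independence is what the cancellation of the inner sum in the proof expresses.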

An immediate result of this Lemma is that 



= E 



E Ee^ 



0<2i .•••,2i;,<m— 1 \k—l 



Lemma 3. For any code $C \subseteq E_m^n$,

$F(C, C) \ge F(E_m^n, E_m^n) = n$.



Proof. Define functions i,j = 0, 1, ...,2n — 2, on A” as follows. If A = 

{xq,xi, ...,Xn-i), then fij{X) = XiXj. Using this notation one can rewrite (2) 
in the following form: 

' " ' xeAYeB 

Let us verify (as in [9]) that for any two subsets A and B of Ef^, 

{E{A,B)f <F{A,A)-F{B,B). (4) 



First change the order of summation and use the Cauchy inequality for codes: 



2n-2 

E E E kAX)hAY) 

XgAYgB 



2n—2 

E E f^AY) 

i,j=0XGA YeB 



VI 


E f^Ax) 

XgA 


2 2n-2 

•E 

i,j^0 


E f^AY) 

ygb 


2n— 2 




2n- 


-2 



= E E E fBj(Y) ■ E E fBj(Y). 

i,j=0XGA YgA i,j=0XGB YgB 
One more change in the order of summation implies (4). Then we note that for
any $A \subseteq E_m^n$ we have $F(A, E_m^n) = n$ by Lemma 2. Therefore the use
of (4) with $A = C$ and $B = E_m^n$ completes the proof.

We remark that using E{A, B) := eAJ2yeB /\MB\ gives a bound 

on 0"^{C). This bound is included in the corollary below: 

Corollary 1. (New ‘Welch’ Lower Bounds) We have 

0^{C) > ^ ) 6»‘‘(C) > 3n^ -2n- (n"‘/M). 

Note that M = ntii there are t cyclically distinct sequences in the code. The new 
‘Welch’ bound on 9‘^{C) will be tighter than the original Welch bound for 9'^{C) 
if Mri^ — 2Mn > 0. where the ratio of the two bounds will be (3/2)^/"^ « 1.107 as 
M and n tend to infinity. The corresponding two cases of Welch’s lower bounds 
are reproduced here for ease of comparison. 



Theorem 1. (Welch) We have 



9^(C) > 



Mn — ri^ 
M- 1 ’ 



9\C) > 



2Mn^ — rA 
M - 1 
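Combining Lemma 1 with Lemma 3, that is, inserting $F(C, C) \ge n$ into (3), gives the first of the Welch bounds, $\theta^2(C) \ge (Mn - n^2)/(M - 1)$. The sketch below evaluates this expression exactly; the function name is an illustrative assumption, and the Gold family parameters are the ones quoted in the introduction.

```python
from fractions import Fraction

def welch_bound_squared(M, n):
    """Lower bound (M*n - n^2) / (M - 1) on theta(C)^2, as an exact fraction."""
    if M < 2:
        raise ValueError("need at least two sequences")
    return Fraction(M * n - n * n, M - 1)

# Gold family of length n = 7 with M = n(n + 2) sequences (cyclic shifts included).
n = 7
M = n * (n + 2)
bound = welch_bound_squared(M, n)
gold_theta_squared = (1 + 4) ** 2   # theta(C) = 1 + sqrt(2(n + 1)) = 5
```

For these parameters the bound is $196/31 \approx 6.3$, compared with $\theta^2(C) = 25$ for the Gold family, which illustrates how loose this first bound is for large family sizes.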



3 The Main Results 

We first restrict ourselves to the case of QAM signals with two “energy shells” 
for ease of exposition. 



3.1 The Lower Bound for Two Shells 

In this section, we consider a signal alphabet of the form

$X = E_{m_0}^n \cup \sqrt{a}\, E_{m_1}^n$ (5)

where the union is assumed to be disjoint, i.e., a codeword belongs either to
$E_{m_0}^n$ or to $\sqrt{a}\, E_{m_1}^n$ but not to a mixed product such as $E_{m_0} \times \sqrt{a}\, E_{m_1} \times \cdots$. As an example, if we
take $X = E_4^3 \cup \sqrt{2}\, E_4^3$, the codewords $(-1, +i, -i)$ and $\sqrt{2}\,(+1, -i, -1)$
are allowed while the codeword $(-1, \sqrt{2}, \sqrt{2}\, i)$ is not allowed. To any vector
$x \in E_{m_0}^n \cup \sqrt{a}\, E_{m_1}^n$ we assign a weight $w_{c(x)}$, where

$w_k \ge 0$, $k = 0, 1$, and $w_0 + w_1 = 1$, (6)

where $c(x)$ is the “class” of the vector $x$, i.e.,

$w_{c(x)} = w_0$ if $x \in E_{m_0}^n$, and $w_{c(x)} = w_1$ if $x \in \sqrt{a}\, E_{m_1}^n$. (7)
For any two subsets A and B of X, we define the value

$$F(A,B) = \frac{1}{|A||B|} \sum_{x \in A} \sum_{y \in B} w_{c(x)} w_{c(y)} |\langle x, y \rangle|^2, \qquad (8)$$

where $\langle u, v \rangle$ denotes the complex inner product of vectors u and v. It is also convenient to define an “unnormalized $F(A,B)$” via $G(A,B) := |A||B|F(A,B)$. We now proceed to prove the generalized versions of the results in the previous section.



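Lemma 4. For any code $C \subset X$ of size M,

$$\theta^2(C) \ge \frac{M^2 F(C,C) - (w_0^2 M_0 + w_1^2 M_1 a^2)\,n^2}{(w_0 M_0 + w_1 M_1)^2 - (w_0^2 M_0 + w_1^2 M_1)} \qquad (9)$$

where $C = C_0 \cup \sqrt{a}\,C_1$, and $|C_i| = M_i$, $C_i \subset E_{m_i}^n$, $i = 0, 1$, with $M = M_0 + M_1$.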



Proof. Considering separately the cases $x = y$ (in this case $|\langle x, y \rangle| = n$ for $x \in C_0$, and $|\langle x, y \rangle| = an$ for $x \in \sqrt{a}\,C_1$), and $x \ne y$ (if x and y are in the same “class” we get the upper bounds $M_i(M_i - 1) w_i^2 \theta^2(C)$, $i = 0, 1$, while if they are in different classes we get the upper bound $2 M_0 M_1 w_0 w_1 \theta^2(C)$) we obtain

$$G(C,C) = M^2 F(C,C) \le (w_0^2 M_0 + w_1^2 M_1 a^2)\,n^2 + (M_0^2 - M_0) w_0^2 \theta^2(C) + (M_1^2 - M_1) w_1^2 \theta^2(C) + 2 M_0 M_1 w_0 w_1 \theta^2(C)$$

which gives the result required.
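The counting argument in this proof can be sanity-checked on a small random code. The sketch below is an illustration under stated assumptions: both shells use the alphabet $E_4$, $\theta^2(C)$ is taken as the largest $|\langle x, y \rangle|^2$ over distinct codewords (cyclic shifts are not added), and `inner` and `lemma4_check` are hypothetical helper names.

```python
import itertools
import random

QPSK = (1, 1j, -1, -1j)  # the alphabet E_4


def inner(x, y):
    """Complex inner product <x, y> = sum_k x_k * conj(y_k)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def lemma4_check(n, M0, M1, a, w0, seed=0):
    """Return (right-hand side of the Lemma 4 bound, theta^2) for a random two-shell code."""
    rng = random.Random(seed)
    words = rng.sample(list(itertools.product(QPSK, repeat=n)), M0 + M1)
    w1 = 1.0 - w0
    scale = a ** 0.5
    # Pair every codeword with the weight of its class.
    code = [(w, w0) for w in words[:M0]]
    code += [(tuple(scale * s for s in w), w1) for w in words[M0:]]
    # G(C, C) = M^2 F(C, C), the unnormalised weighted sum.
    G = sum(wx * wy * abs(inner(x, y)) ** 2 for x, wx in code for y, wy in code)
    theta2 = max(abs(inner(x, y)) ** 2
                 for i, (x, _) in enumerate(code)
                 for j, (y, _) in enumerate(code) if i != j)
    diag = (w0**2 * M0 + w1**2 * M1 * a**2) * n**2
    denom = (w0 * M0 + w1 * M1) ** 2 - (w0**2 * M0 + w1**2 * M1)
    return (G - diag) / denom, theta2


lower, theta2 = lemma4_check(n=5, M0=6, M1=4, a=2.0, w0=0.4)
print(lower, theta2)
```

For any choice of weights the first returned value should not exceed the second, since the off-diagonal weights sum exactly to the denominator of (9).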

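Lemma 5. We have

$$F(\{x\}, X) = \begin{cases} n\,[\gamma w_0^2 + (1-\gamma) w_0 w_1 a] & \text{if } x \in E_{m_0}^n, \\ n\,[\gamma w_0 w_1 a + (1-\gamma) w_1^2 a^2] & \text{if } x \in \sqrt{a}\,E_{m_1}^n, \end{cases}$$

where $\gamma = m_0^n/(m_0^n + m_1^n)$.

Proof. Note that

$$G(\{x\}, X) = G(\{x\}, E_{m_0}^n \cup \sqrt{a}\,E_{m_1}^n) = w_{c(x)} w_0 \sum_{y \in E_{m_0}^n} |\langle x, y \rangle|^2 + w_{c(x)} w_1 a \sum_{y \in E_{m_1}^n} |\langle x, y \rangle|^2$$

which gives

$$G(\{x\}, X) = \begin{cases} w_0^2 F_{n,m_0} + w_0 w_1 a F_{n,m_1} & \text{if } x \in E_{m_0}^n, \\ w_0 w_1 a F_{n,m_0} + w_1^2 a^2 F_{n,m_1} & \text{if } x \in \sqrt{a}\,E_{m_1}^n, \end{cases}$$

and yields the result required after using Lemma 2 (which states that $F_{n,m_i} = n m_i^n$, $i = 0, 1$) and noting that $F(\{x\}, X) = G(\{x\}, X)/(m_0^n + m_1^n)$.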

Now, we can investigate the conditions under which

$$F(C,C) \ge F(X,X) = F(E_{m_0}^n \cup \sqrt{a}\,E_{m_1}^n,\ E_{m_0}^n \cup \sqrt{a}\,E_{m_1}^n)$$

can be made to hold.

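Lemma 6.

$$F(C, X) = n\,[\delta(\gamma w_0^2 + (1-\gamma) w_0 w_1 a) + (1-\delta)(\gamma w_0 w_1 a + (1-\gamma) w_1^2 a^2)]$$

Proof. Note that for any code $C = C_0 \cup \sqrt{a}\,C_1$ which is a subset of $E_{m_0}^n \cup \sqrt{a}\,E_{m_1}^n$ we have

$$G(C_0 \cup \sqrt{a}\,C_1, X) = G(C_0 \cup \sqrt{a}\,C_1,\ E_{m_0}^n \cup \sqrt{a}\,E_{m_1}^n) = n\,[M_0 m_0^n w_0^2 + M_0 m_1^n w_0 w_1 a + M_1 m_0^n w_0 w_1 a + M_1 m_1^n w_1^2 a^2]$$

which gives

$$F(C_0 \cup \sqrt{a}\,C_1, X) = \frac{G(C_0 \cup \sqrt{a}\,C_1, X)}{(M_0 + M_1)(m_0^n + m_1^n)} = n\,[\gamma(\delta w_0^2 + (1-\delta) w_0 w_1 a) + (1-\gamma)(\delta w_0 w_1 a + (1-\delta) w_1^2 a^2)]$$

where $\delta = M_0/(M_0 + M_1)$.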

We now apply Lemma 6 to $C = X$ to obtain

$$F(X, X) = n\,[\gamma w_0 + (1-\gamma) a w_1]^2$$

while

$$F(C, X) = n\,[\delta(\gamma w_0^2 + (1-\gamma) w_0 w_1 a) + (1-\delta)(\gamma w_0 w_1 a + (1-\gamma) w_1^2 a^2)],$$

hence the choice $\delta = \gamma$ is sufficient to ensure that

$$F(C, X) \ge F(X, X)$$

holds (in fact, $F(C,X) = F(X,X)$ holds here). We now appeal to an obvious weighted generalization of Lemma 3 to obtain that for the case $\delta = \gamma$, we still have

$$F(C,X)^2 \le F(X,X)\,F(C,C)$$

which now becomes

$$F(X,X)^2 \le F(X,X)\,F(C,C)$$

which implies

$$F(X,X) \le F(C,C).$$



Hence, for such codes whose codeword distributions across energy “shells” (i.e., $M_0/(M_0 + M_1)$) are equal to the word distributions for the total codeword space (i.e., $m_0^n/(m_0^n + m_1^n)$), we have the following:
Theorem 2. When $\delta = \gamma$ we have

$$\theta^2(C) \ge \max_{0 \le w \le 1} \frac{M^2 F(X,X) - (w_0^2 M_0 + w_1^2 M_1 a^2)\,n^2}{(w_0 M_0 + w_1 M_1)^2 - (w_0^2 M_0 + w_1^2 M_1)} = \max_{0 \le w \le 1} \frac{Mn\,(\gamma w + (1-\gamma) a (1-w))^2 - (w^2 \gamma + (1-w)^2 a^2 (1-\gamma))\,n^2}{M\,(\gamma w + (1-\gamma)(1-w))^2 - (w^2 \gamma + (1-w)^2 (1-\gamma))}.$$

Proof. Let $w_0 = w$, $w_1 = 1 - w$, and use Lemma 4.



3.2 The General Lower Bound 



Here we consider a signal set with an arbitrary number, say v, of energy “shells.” Let the signal alphabet be of the form

$$X = \sqrt{a_0}\,E_{m_0}^n \cup \dots \cup \sqrt{a_{v-1}}\,E_{m_{v-1}}^n \qquad (10)$$

where the union is assumed to be disjoint, i.e., a codeword belongs to a unique $\sqrt{a_i}\,E_{m_i}^n$, for some $i = 0, \dots, v-1$. To any vector $x \in \sqrt{a_0}\,E_{m_0}^n \cup \dots \cup \sqrt{a_{v-1}}\,E_{m_{v-1}}^n$, we assign a weight $w_{c(x)}$, where

$$w_k \ge 0, \quad k = 0, 1, \dots, v-1 \quad \text{and} \quad w_0 + \dots + w_{v-1} = 1, \qquad (11)$$

where $c(x)$ is the “class” of the vector x (compare Section 3.1). The definitions of $F(A,B)$ and $G(A,B)$ are as in Section 1. The quantities $M_i$, $m_i$, $\gamma_i$, $\delta_i$, for $i = 0, \dots, v-1$, all have the same meanings as in Section 3.1. The arguments to prove Theorem 4 parallel those in Section 3.1 and are omitted.

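Theorem 3. By choosing $\delta_i = \gamma_i$, $i = 0, \dots, v-1$, we have

$$\theta^2(C) \ge \max_{\substack{w_0 + \dots + w_{v-1} = 1 \\ w_i \ge 0,\ i = 0, \dots, v-1}} \frac{Mn \left( \sum_{i=0}^{v-1} w_i \gamma_i a_i \right)^2 - n^2 \left( \sum_{i=0}^{v-1} w_i^2 \gamma_i a_i^2 \right)}{M \left( \sum_{i=0}^{v-1} w_i \gamma_i \right)^2 - \left( \sum_{i=0}^{v-1} w_i^2 \gamma_i \right)}.$$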



4 Discussion and Conclusions 



Note that all the results we have derived apply to constellations which can be obtained by arbitrary rotations of the shells relative to each other, due to the generality of Lemma 3. The family of bounds we have obtained is in general non-convex as a function of $(w_0, \dots, w_{v-1})$. We now state a special case of a previously proved theorem [2] on CDMA sequences with arbitrary energy distribution.

Theorem 4. (Boztaş) For any complex sequence family C with arbitrary energy distribution, length n and M cyclically distinct sequences $x_1, \dots, x_M$, we have the lower bound



6»2 > /(Mn-1) 



/v—1 



,7*0* 






\i—0 



where is the average of the energy moment of the sequence family. 
[Figure 1: the points (1,0), (0,1), (-1,0), (0,-1) marked in the complex plane, with axes Re(z) and Im(z).]

Fig. 1. A two shell constellation corresponding to $E_4 \cup \sqrt{2}\,E_4$



We now consider an example which uses the symbol constellation of Figure 1. 

Example 1. Let $\gamma_i = 1/2$, $m_i = 4$, $i = 0, 1$, $a_0 = 1$, $a_1 = 2$, and $M = nt$, i.e., t cyclically distinct sequences each of length n. Then the bound in Theorem 3
becomes 0^ > n{j — j^)- However, the new bound in Theorem 2 becomes 

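$$\theta^2 \ge \frac{Mn \left( \tfrac{w}{2} + (1-w) \right)^2 - n^2 \left( \tfrac{w^2}{2} + 2(1-w)^2 \right)}{\tfrac{M}{4} \left( w + (1-w) \right)^2 - \tfrac{1}{2} \left( w^2 + (1-w)^2 \right)}$$

which is a ratio of two quadratics. Simply letting $w \to 0$ yields $\theta^2 \ge [4Mn - 8n^2]/(M-2) = [4n^2(t-2)]/(nt-2) \approx 4n\,[(t-2)/t]$, which is clearly better for typical values $t > n$ of practical interest and tends to $4n$ as $t \to \infty$.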
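The two-shell bound is easy to evaluate numerically. The sketch below assumes the Theorem 2 expression in the form $[Mn(\gamma w + (1-\gamma)a(1-w))^2 - (w^2\gamma + (1-w)^2a^2(1-\gamma))n^2] / [M(\gamma w + (1-\gamma)(1-w))^2 - (w^2\gamma + (1-w)^2(1-\gamma))]$; the function names and the grid search over w are illustrative choices, not part of the paper.

```python
def two_shell_bound(M, n, gamma, a, w):
    """Value of the two-shell lower bound on theta^2 for a given weight w."""
    mix_energy = gamma * w + (1 - gamma) * a * (1 - w)
    mix_plain = gamma * w + (1 - gamma) * (1 - w)
    num = M * n * mix_energy**2 - (w**2 * gamma + (1 - w)**2 * a**2 * (1 - gamma)) * n**2
    den = M * mix_plain**2 - (w**2 * gamma + (1 - w)**2 * (1 - gamma))
    return num / den


def best_weight(M, n, gamma, a, steps=1000):
    """Grid search over w in [0, 1]; returns (bound value, w)."""
    return max((two_shell_bound(M, n, gamma, a, k / steps), k / steps)
               for k in range(steps + 1))


n, t = 31, 100          # illustrative sizes: t sequences of length n
M = n * t
closed_form = (4 * M * n - 8 * n**2) / (M - 2)   # the w -> 0 expression of Example 1
print(two_shell_bound(M, n, 0.5, 2.0, 0.0), closed_form)
print(best_weight(M, n, 0.5, 2.0))
```

At $w = 0$ the function should reproduce the closed form $[4Mn - 8n^2]/(M-2)$ of Example 1; the grid search shows whether another weight does better for the chosen sizes.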



5 Conclusions 



In this paper, we have developed a new lower bound for the periodic correlation of sequence families. It must be emphasized that the new bound, while tighter than the previously derived lower bound [2] on QAM sequences, applies to a more restricted type of sequence, i.e., sequences which are restricted to have symbols in one of the possible “shells” of the constellation. Table 1 compares various performance parameters of different constellations, for the case that $M \approx n^2$, which is of most interest for asynchronous CDMA applications. Note that it is possible to use the constellation $E_4 \cup \sqrt{2}\,E_4$ and obtain improved minimum Euclidean distance at the cost of degraded Peak-Off-Peak correlation ratio over 8-PSK. The challenge is to apply algebraic design techniques to design families that approach the bounds derived here.
Table 1. Performance parameters of various constellations for $M \approx n^2$. The symbol * means that a known sequence family achieves the given value.



Sequence Type 


Data Rate 
(bits/symbol) 


Minimum Peak to 
Off-Peak Ratio ^/Iie{C) 


Minimum Euclidean 
Distance d 


4-PSK 


2 


= Vn* 

Vn V 


2 sin(7r/4) = \/2 


8-PSK 


3 




2sin(7r/8) « 0.765 


Er U y/2E2 


3 


- 7^ = 


1 
Abstract. A general equation is given for the size of complex constellations constructed from the direct sum of PSK-like constellation primitives. The equation uses a generating function whose numerator is a power of a 'coordination polynomial'. Conjectures are also given as to the form and value of these coordination polynomials for various PSK. The study has relevance to error-coding, polynomial residue number theory, and the analysis of random walks.



1 Introduction 

Communications systems often transmit data by modulating using Binary or 
Quaternary Phase Shift Keyed (BPSK or QPSK) or Quadrature Amplitude 
Modulated (QAM) constellations in the complex plane. But larger constellations 
can be more bandwidth-efficient and lead to efficient hardware implementation 
of complex arithmetic and algorithms [1,2]. 

This paper considers the problem of finding the size of constellations constructed from direct sums of {PSK plus the origin}, referred to here as 'PSK⊕' constellations. These constellations form lattices for 1, 2, 3, or 6 PSK primitives, but for any other PSK⊕ there will be residue 'folding', making the determination of constellation size more complicated. This problem can be recast, for mPSK⊕, as finding an expression for the number of non-identical polynomial residues resulting from the reduction, mod $\Phi_m(x)$, of polynomials in x of Coefficient Weight $\le n$ (for some positive integer, n), and degree $< m$, where $\Phi_m(x)$ is the m-th cyclotomic polynomial in x. Although residue folding is, for many applications, undesirable, it is hoped that an algebraic understanding of PSK⊕ will help in the construction of constellations more suited to communications systems which use PSK⊕ as building blocks. Also, from an algebraic point of view, it is useful to be able to enumerate the residues of polynomials, mod $\Phi_m(x)$. The theorem and conjectures to be presented here are based on computational results. During the course of the work integer sequences, relating to the 8PSK⊕ and 16PSK⊕ constellations, were entered into Sloane's On-Line Encyclopedia of Integer Sequences [3] and were found to refer, in particular, to the paper by Conway and Sloane on Low Dimensional Lattices [4] which, in turn, references work by O'Keefe [5] and
* This work was funded by NFR Project Number 119390/431. 
others [6]. Their results have applications to crystallography, and use generating functions which require the specification of a 'Coordination Sequence'. This paper conjectures a general solution to a related problem, although a general form for the Coordination Sequence (Polynomial) has yet to be found. The results could be used to help extend the scope of error coding strategies such as [7,8], and may also be useful for the development of 'Random Walk' statistics.

2 Statement of the Problem 

Define mPSK+ as the set of m + 1 points in the complex plane given by

$$\text{mPSK+} = \{0, 1, w, w^2, \dots, w^{m-1}\}$$

where $w = e^{2\pi i/m}$, and $i^2 = -1$. Define mPSK⊕n as the direct sum of n copies of mPSK+, given by

$$\text{mPSK⊕n} = \bigoplus_{k=0}^{n-1} \{0, 1, w, w^2, \dots, w^{m-1}\}$$

We wish to find a formula for $d_n$ as n varies over the positive integers, where $d_n$ is the number of non-identical points in mPSK⊕n, given by

$$d_n = \left| \bigoplus_{k=0}^{n-1} \{0, 1, w, w^2, \dots, w^{m-1}\} \right|$$

For instance, let m = 4. The kernel constellation is $\{0, 1, w, w^2, w^3\}$, where $w = e^{2\pi i/4}$, and

$$d_2 = \left| \bigoplus_{k=0}^{1} \{0, 1, w, w^2, w^3\} \right| = |\{0, \pm 1, \pm w, \pm 1 \pm w, \pm 1 \mp w, \pm 2, \pm 2w\}| = 13$$

As another example, for m = 6 and n = 2, 

d 2 = |{0, ±1, ±rc, ±2, ±2w, ±2w^, ±1 ± w, ±1 0 w“^,±w 0 = 19 

An algebraic description of the same problem is as follows. 

Definition 1 The 'Coefficient Weight', (cw), of a polynomial, $f(x)$, is the sum of its coefficient values. In other words $cw(f(x)) = f(1)$.

Let $g(x) = \sum_i g_i x^i$. Let

$$G_{m,n} = \{ g(x) \mid 0 \le \deg(g(x)) < m,\ g_i \ge 0\ \forall i,\ 0 \le cw(g(x)) \le n \}$$

where $\deg(a(x))$ is the degree of $a(x)$. Let $x = e^{2\pi i/m}$, where $i^2 = -1$. Then

$$\text{mPSK⊕n} = \{ h(x) \mid h(x) = \langle g(x) \rangle_{\Phi_m(x)},\ \forall g(x) \in G_{m,n} \}$$

where $\langle a \rangle_b$ is the residue of a mod b, and $\Phi_m(x)$ is the m-th cyclotomic polynomial. Therefore

$$d_n = |\text{mPSK⊕n}|$$

as before.
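The algebraic description lends itself to an exact brute-force count for small m and n. The following is a minimal sketch, not the author's program: points are stored as integer coefficient vectors of residues mod $\Phi_m(x)$, so no floating-point comparison is needed. All function names are illustrative assumptions.

```python
from functools import lru_cache


def poly_divide_exact(num, den):
    """Quotient of integer polynomials (lowest degree first); den must be monic."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for shift in range(len(quot) - 1, -1, -1):
        coef = num[shift + len(den) - 1]
        quot[shift] = coef
        for j, d in enumerate(den):
            num[shift + j] -= coef * d
    return quot


@lru_cache(maxsize=None)
def cyclotomic(m):
    """Coefficients of the m-th cyclotomic polynomial, lowest degree first."""
    poly = [-1] + [0] * (m - 1) + [1]            # x^m - 1
    for d in range(1, m):
        if m % d == 0:                           # strip Phi_d for every proper divisor d
            poly = poly_divide_exact(poly, list(cyclotomic(d)))
    return tuple(poly)


def power_residues(m):
    """Residues of x^k mod Phi_m(x), k = 0..m-1, as integer coefficient tuples."""
    phi = cyclotomic(m)
    deg = len(phi) - 1
    cur = [1] + [0] * (deg - 1)
    residues = []
    for _ in range(m):
        residues.append(tuple(cur))
        top = cur[-1]
        cur = [0] + cur[:-1]                     # multiply by x
        cur = [c - top * p for c, p in zip(cur, phi[:deg])]  # x^deg = -(lower terms)
    return residues


def constellation_sizes(m, n_max):
    """[d_0, ..., d_n_max] for the direct sum of n copies of {0, 1, w, ..., w^(m-1)}."""
    steps = power_residues(m)
    points = {tuple([0] * len(steps[0]))}
    sizes = [1]
    for _ in range(n_max):
        points |= {tuple(a + b for a, b in zip(p, s)) for p in points for s in steps}
        sizes.append(len(points))
    return sizes


print(constellation_sizes(4, 3))
print(constellation_sizes(6, 5))
```

Because the origin belongs to the kernel constellation, each step only ever enlarges the point set, so a single running set suffices.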
3 Computational Results

Tables 1 and 2 show some computed values of $d_n$ for various n and m. The number of Euclidean distances, D, refers to the size of the set of values for the absolute (straight-line) distance from each point in mPSK⊕n to the origin. The figures for D are not discussed further in this paper, but are included here for the reader's interest.



Table 1. Constellation and Euclidean Distance Enumerations for Various mPSK⊕n.

$d_n$ - No of points in constellation. D - No of Euclidean distances.




And here are a few more partial results for the case m = 8. 



Table 2. Constellation Enumerations for More 8PSK⊕n.



13 

dn 



15 

d-n 



22569129961 



4 Some Conjectures 

We shall form a generating function for the sequences, $d_n$, where $d_n$ is different for every m. Thus define $d_m(x) = \sum_{n=0}^{\infty} d_n x^n$. The following conjecture satisfies all numerical results quoted above.
Conjecture 1

$$d_m(x) = \frac{c_h(x)^{m/h}}{(1 - x)^{\phi(m)+1}}$$

where $\phi$ is Euler's Totient Function, h is the square free part of m, and $c_h(x)$ is referred to as the h-th coordination polynomial. $c_h(x)$ is palindromic and $\deg(c_h(x)) = \phi(h)$.

The above conjecture omits to specify exactly the form of $c_h(x)$. This is an area of further research. However the following theorem determines $c_h(x)$ where h is a prime, and two following conjectures satisfy the computational results for h = 2p, p an odd prime, and h = 15, respectively.



Theorem 1

$$c_p(x) = \Phi_p(x), \quad p \text{ prime}$$

Theorem 1 was conjectured by the author based on numerical computation. A proof was found by T. Kløve and it is given in Appendix A.

Conjecture 2 

P-3 P-1 

2 k 2 

C2p{x) = ^ ^ ( i ) + ^ i ) ’ p an odd prime 

k—0 i=0 i=0 



Conjecture 3

$$c_{15}(x) = (1 + x^8) + 7(x + x^7) + 28(x^2 + x^6) + 79(x^3 + x^5) + 130x^4$$



The following observation was also made. 



Conjecture 4 



m 




From the computational results values of $c_h(x)$ have also been partially ascertained for various h as shown in Table 3.

All preceding coordination polynomials were computed from the $d_n$ sequences using the following strategy. For instance, for m = 6 the $d_n$ sequence is computed to be 1, 7, 19, 37, 61, 91, 127, 169, ... Thus $d_6(x) = 1 + 7x + 19x^2 + 37x^3 + 61x^4 + 91x^5 + 127x^6 + 169x^7 + \dots$ Note that $\phi(6) + 1 = 3$ so, from Conjecture 1, we multiply $d_6(x)$ (truncated to degree 7) by $(1 - x)^3$ to get $c_6(x) = e(x) + x^2 + 4x + 1$, where $e(x)$ is some error term due to having truncated $d_6(x)$ to degree 7. In this case $e(x) = -217x^8 + 380x^9 - 169x^{10}$, which is evidently an error term so $c_6(x) = x^2 + 4x + 1$. The same strategy can be used to compute $c_h(x)$ for all sequences in the table, and hence arrive at the preceding Conjectures 2 - 3 on the form of $c_h(x)$.
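The truncation strategy for m = 6 can be reproduced in a few lines. This is a minimal sketch with illustrative helper names; it multiplies the truncated series by $(1 - x)^3$ and separates the low-order part from the truncation residue.

```python
def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def numerator_from_sequence(d, power):
    """Multiply the truncated series sum d[n] x^n by (1 - x)^power."""
    factor = [1]
    for _ in range(power):
        factor = poly_mul(factor, [1, -1])
    return poly_mul(d, factor)


d6 = [1, 7, 19, 37, 61, 91, 127, 169]        # d_n for m = 6, n = 0..7
product = numerator_from_sequence(d6, 3)     # phi(6) + 1 = 3
candidate, error = product[:len(d6)], product[len(d6):]
print(candidate)   # coefficients up to x^7: the candidate coordination polynomial
print(error)       # coefficients of x^8..x^10: artefact of the truncation
```

Only coefficients below the truncation degree are trustworthy; everything at or above degree 8 here depends on terms of $d_6(x)$ that were cut off.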
Table 3. Incomplete Coordination Polynomials for Various Composite h.

h     c_h(x)
21    1 + 9x + 45x^2 + 158x^3 + 432x^4 + 909x^5 + ... ?
30    1 + 22x + 208x^2 + 874x^3 + 1480x^4 + ... ?
33    1 + 13x + 91x^2 + 444x^3 + 1677x^4 + ... ?
35    1 + 22x + 208x^2 + 874x^3 + 1480x^4 + ... ?



5 Triangle Patterns 



An examination of number triangles may give a clue as to how to extend the 
previous conjectures on coordination polynomials to the more general case. On 
page 14 of [4] it was observed that the coordination polynomials for the dual 
lattice, satisfy the following ’coordinator’ triangle. 

1
1 1
1 4 1
1 5 5 1
1 6 16 6 1
1 7 22 22 7 1
1 8 29 64 29 8 1
1 9 37 93 93 37 9 1
1 10 46 130 256 130 46 10 1
1 11 56 176 386 386 176 56 11 1
1 12 67 232 562 1024 562 232 67 12 1



The p-th line of the above triangle, p prime, also provides the coordination polynomials, $c_{2p}(x)$, for Conjectures 1 and 2 of this paper.

In the same way we can construct a partial triangle for the $c_{3p}(x)$ case, using our previous computational results. Thus,



1 4 1
1 5 ? 5 1
1 6 21 ? 21 6 1
1 7 28 79 130 79 28 7 1
1 8 36 114 282 ? 282 114 36 8 1
1 9 45 158 432 909 ? 909 432 158 45 9 1
1 10 55 212 635 1499 ? ? ? 1499 635 212 55 10 1
1 11 66 277 902 2346 ? ? ? ? ? 2346 902 277 66 11 1
1 12 78 354 1245 3525 ? ? ? ? ? ? ? 3525 1245 354 78 12 1
1 13 91 444 1677 5124 ? ? ? ? ? ? ? ? ? 5124 1677 444 91 13 1



where each entry apart from those of the middle three columns seems to be the sum of the three entries immediately above, e.g. 158 = 8 + 36 + 114. Note that the only triangle entries directly computed from computational results are the sequences 1,4,1, and 1,7,28,79,130,79,28,7,1, and 1,9,45,158,432,909, and 1,13,91,444,1677. All other numbers in the above triangle are nominally filled in to fit the 'sum of three' conjecture. The $c_{3p}(x)$ coordination polynomial can be read from the p-th line of the previous triangle for p prime. For instance, $c_{15}(x) = (1 + x^8) + 7(x + x^7) + 28(x^2 + x^6) + 79(x^3 + x^5) + 130x^4$. Although we do not currently have an equation for $c_{3p}(x)$ it is worth noting that the following triangle is similar to the previous triangle.
1 4 1
1 5 15 5 1
1 6 21 52 21 6 1
1 7 28 79 175 79 28 7 1
1 8 36 114 282 576 282 114 36 8 1
1 9 45 158 432 972 1869 972 432 158 45 9 1
1 10 55 212 635 1562 3273 6000 3273 1562 635 212 55 10 1
1 11 66 277 902 2409 5470 10835 19107 10835 5470 2409 902 277 66 11 1
1 12 78 354 1245 3588 8781 18714 35412 60460 35412 18714 8781 3588 1245 354 78 12 1
1 13 91 444 1677 5187 13614 31083 62907 114586 190333 114586 62907 31083 13614 5187 1677 444 91 13 1

and satisfies the equation,

$$P_r(x) = \sum_{k=0}^{r-1} (3x)^k (1 + x + x^2)^{r-1-k}$$

for each line of the triangle, r, thus providing a clue as to the true form of $c_{3p}(x)$.
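The rows of this last triangle can be regenerated directly from the equation. The sketch below assumes the equation reads $P_r(x) = \sum_{k=0}^{r-1} (3x)^k (1 + x + x^2)^{r-1-k}$; `poly_mul` and `triangle_row` are illustrative names.

```python
def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def triangle_row(r):
    """Coefficients of P_r(x) = sum_{k=0}^{r-1} (3x)^k (1 + x + x^2)^(r-1-k)."""
    total = [0] * (2 * r - 1)
    for k in range(r):
        term = [1]
        for _ in range(r - 1 - k):
            term = poly_mul(term, [1, 1, 1])
        for i, c in enumerate(term):          # multiply by (3x)^k and accumulate
            total[i + k] += 3**k * c
    return total


for r in range(2, 8):
    print(r, triangle_row(r))
```

Every term of the sum is palindromic about $x^{r-1}$, so each row is palindromic; away from the centre the rows also obey the 'sum of three' rule, because $P_r(x) = (1 + x + x^2) P_{r-1}(x) + (3x)^{r-1}$.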

Appendix B sketches out an alternative strategy for the rapid computation of the coefficients, $d_n$, of $d_m(x)$ for general m, and is a good starting place for further research. It is hoped that the strategy of Appendix B may lead to a proof of the remaining conjectures, in particular Conjecture 1, and may also lead to a theorem for the construction of $c_h(x)$ in the general case.



6 Conclusion 

This paper has presented computational results relating to the size of constel- 
lations formed from direct sums of PSK-type constellations. A theorem and 
a number of conjectures have been offered, comprising formulae for the rapid 
computation of sizes of such ’direct-sum’ constellations. These formulae have 
application to error-control coding, random-walk statistics, algebraic number 
representations, and (polynomial) residue number theory. It remains to verify 
the conjectures. 



7 Appendix A - Proof of Theorem 1 by T. Kløve

Let $P_r(n)$ denote the number of ordered partitions of n into r parts, that is

$$P_r(n) = |\{(a_1, a_2, \dots, a_r) \in \mathbb{N}^r \mid a_1 + a_2 + \dots + a_r = n\}|$$

where $\mathbb{N} = \{0, 1, 2, \dots\}$. If

$$n = a_1 + a_2 + \dots + a_{r-1} + a_r$$

and $a_i \ge 1$ for $i = 1, 2, \dots, r-1$, then

$$n - (r-1) = (a_1 - 1) + (a_2 - 1) + \dots + (a_{r-1} - 1) + a_r$$

and vice versa. Hence the number of ordered partitions of n into r parts such that the first r - 1 parts are positive is $P_r(n - (r-1))$.
Lemma 1 We have

$$\sum_{n=0}^{\infty} P_r(n) x^n = \frac{1}{(1-x)^r}$$

and

$$\sum_{n=r-1}^{\infty} P_r(n - (r-1))\, x^n = \frac{x^{r-1}}{(1-x)^r}$$

Proof of Lemma 1: These are standard results from the theory of partitions:

$$\sum_{n=0}^{\infty} P_r(n) x^n = (1 + x + x^2 + x^3 + \dots)^r = \frac{1}{(1-x)^r}$$

and

$$\sum_{n=r-1}^{\infty} P_r(n - (r-1))\, x^n = x^{r-1} \sum_{n=r-1}^{\infty} P_r(n - (r-1))\, x^{n-(r-1)} = \frac{x^{r-1}}{(1-x)^r}$$

Lemma 2 Let m be an odd prime. Then $d_n = P_{m+1}(n) - P_{m+1}(n - m)$.

Proof of Lemma 2: $d_n$ counts the number of distinct sums

$$a_1 w + a_2 w^2 + \dots + a_m w^m + a_{m+1} \cdot 0 \qquad (1)$$

where $a_i \ge 0$ for $i = 1, 2, \dots, m+1$ and $a_1 + a_2 + \dots + a_{m+1} = n$. Noting that $w + w^2 + \dots + w^m = 0$ we get $d_n$ by counting all sums (1), this number is $P_{m+1}(n)$, and subtracting the number of sums where $a_i \ge 1$ for $i = 1, 2, \dots, m$, this number is $P_{m+1}(n - m)$ (as explained above).

■

Theorem 1 now follows from the two lemmas:

$$\sum_{n=0}^{\infty} d_n x^n = \sum_{n=0}^{\infty} P_{m+1}(n) x^n - \sum_{n=0}^{\infty} P_{m+1}(n - m) x^n = \frac{1 - x^m}{(1-x)^{m+1}} = \frac{\Phi_m(x)}{(1-x)^m}$$

since $\Phi_m(x) = x^{m-1} + x^{m-2} + \dots + 1$.
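The final identity can be checked coefficient by coefficient. The sketch below assumes $P_r(n) = \binom{n+r-1}{r-1}$ (with $P_r(n) = 0$ for negative n) and expands $\Phi_m(x)/(1-x)^m$ by repeated prefix sums; the function names are illustrative.

```python
from math import comb


def ordered_partitions(r, n):
    """P_r(n): number of r-tuples of non-negative integers summing to n."""
    return comb(n + r - 1, r - 1) if n >= 0 else 0


def d_from_lemma2(m, n):
    """d_n = P_{m+1}(n) - P_{m+1}(n - m), as in Lemma 2 (m prime)."""
    return ordered_partitions(m + 1, n) - ordered_partitions(m + 1, n - m)


def series_phi_over_power(m, terms):
    """First `terms` coefficients of (1 + x + ... + x^(m-1)) / (1 - x)^m."""
    coeffs = [1 if k < m else 0 for k in range(terms)]
    for _ in range(m):                       # each prefix-sum pass divides by (1 - x)
        for k in range(1, terms):
            coeffs[k] += coeffs[k - 1]
    return coeffs


for m in (3, 5, 7):
    print(m, series_phi_over_power(m, 8), [d_from_lemma2(m, n) for n in range(8)])
```

The two lists printed for each prime m should coincide term by term.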



8 Appendix B - A General Strategy for Computing the Size of PSK⊕ Constellations

Here a technique is proposed for the fast computation of the coefficients of $d_m(x)$ in the general case. Hopefully this may lead to a general proof of the conjectures of this paper, and a fast way to construct $c_h(x)$ in the general case, at least for m up to some large value. The technique will be illustrated by looking at the case where m = 6. Note that $\Phi_6(x) = x^2 - x + 1$. The steps of the technique are the following subsection headings.
8.1 Find all Forbidden Binary Patterns

$\Phi_6(x)$ implies the following polynomial equivalences:

x^2 + 1 = x    pattern is 101000

x^3 + 1 = 0    pattern is 100100

These are the two binary patterns (polynomials) which are 'forbidden' for m = 6. The forbidden polynomials are the set of polynomials which are equivalent, mod $\Phi_m(x)$, to polynomials of lower Hamming weight. Note that, for example,
x‘^ — x+1 is not included as a ’forbidden’ polynomial as it includes the polynomial 
+ 1 as a sub-polynomial. In general, for m = 2p, p prime, there are only 
two forbidden polynomials, namely, $x^{p-1} + x^{p-3} + x^{p-5} + \dots + x^2 + 1$, and $x^p + 1$. More generally, for large, composite m, there may be non-binary forbidden polynomials.

8.2 Enumerate all Length m Binary Words which Avoid the 
Forbidden Patterns 

For m = 6, and for Hamming Weights (hw) 0-6 we have the following cyclically distinct binary strings which avoid the forbidden patterns or any cyclic shift of the forbidden patterns.

hw = 0 000000
hw = 1 100000
hw = 2 110000
hw = 3 none
hw = 4 none
hw = 5 none
hw = 6 none

Each string of non-zero Hamming Weight has cyclic shift order 6. We will refer to the set of length m strings which avoid the forbidden patterns as the 'foundation' polynomials. These 'foundation' polynomials form the set E. For m = 6, |E| = 3. We will define there to be $e_{hw,m}$ cyclically distinct length m binary words in E, $0 \le hw \le m$. For m = 6, $e_{0,6} = 1$, $e_{1,6} = 1$, $e_{2,6} = 1$, $e_{3,6} = 0$, $e_{4,6} = 0$, $e_{5,6} = 0$, $e_{6,6} = 0$. Note that $e_{0,m} = 1\ \forall m$.
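The enumeration for m = 6 can be reproduced by brute force. This is a minimal sketch, assuming a word 'contains' a forbidden pattern when every 1 of some cyclic shift of the pattern is also a 1 of the word; the function names are illustrative.

```python
from itertools import product


def rotations(word):
    """All cyclic shifts of a binary string."""
    return {word[k:] + word[:k] for k in range(len(word))}


def contains(word, pattern):
    """True if every 1 of `pattern` is also a 1 of `word`."""
    return all(w == "1" for w, p in zip(word, pattern) if p == "1")


def foundation_counts(m, forbidden):
    """counts[hw] = number of cyclically distinct words of weight hw avoiding all patterns."""
    banned = set().union(*(rotations(p) for p in forbidden))
    seen, counts = set(), [0] * (m + 1)
    for bits in product("01", repeat=m):
        word = "".join(bits)
        if word in seen:
            continue
        seen |= rotations(word)              # count each cyclic class once
        if not any(contains(word, b) for b in banned):
            counts[word.count("1")] += 1
    return counts


print(foundation_counts(6, ["101000", "100100"]))
```

The exhaustive loop over all $2^m$ words is only practical for small m; it is meant as a check of the table above, not as the efficient method asked for in Section 8.4.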

8.3 Use Each Member of E as a 'Foundation' for Building All Length m Inequivalent Polynomials of Coefficient Weight n, mod $\Phi_m(x)$

The '1' positions of the 'foundation' polynomials of E mark the positions where we are allowed to add 'coefficient weight' to construct our inequivalent polynomials. It therefore follows that the number of inequivalent polynomials, $d_n$, satisfies

$$d_n = 1 + m \sum_{k=1}^{n} \sum_{hw=1}^{k} \binom{k-1}{hw-1} e_{hw,m} \qquad (2)$$
For m = 6,

d_0 = 1

d_1 = 1 + 6 = 7

d_2 = 1 + 6 + 6(1 + 1) = 19

d_3 = 1 + 6 + 6(1 + 1) + 6(1 + 2 + 0) = 37

d_4 = 1 + 6 + 6(1 + 1) + 6(1 + 2 + 0) + 6(1 + 3 + 0 + 0) = 61

d_5 = 1 + 6 + 6(1 + 1) + 6(1 + 2 + 0) + 6(1 + 3 + 0 + 0) + 6(1 + 4 + 0 + 0 + 0) = 91

. . . etc

These numbers agree with those of Table 1. The number of r-way ordered partitions adding to n is $P_r(n)$, and

$$P_r(n) = \binom{n + r - 1}{r - 1}$$

Therefore we can rewrite (2) in terms of partitions as,

$$d_n = 1 + m \sum_{k=1}^{n} \sum_{hw=1}^{k} P_{hw}(k - hw)\, e_{hw,m} \qquad (3)$$
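Equation (3) can be evaluated directly from the foundation counts. The sketch below assumes $P_r(n) = \binom{n+r-1}{r-1}$, takes the inner sum up to $\min(k, m)$ (weights above m cannot occur in a length m word), and uses illustrative function names.

```python
from math import comb


def P(r, n):
    """Number of ordered partitions of n into r non-negative parts."""
    return comb(n + r - 1, r - 1) if n >= 0 else 0


def d_from_foundation(m, e, n):
    """d_n = 1 + m * sum_{k=1..n} sum_{hw=1..min(k,m)} P_hw(k - hw) * e[hw]."""
    total = 0
    for k in range(1, n + 1):
        for hw in range(1, min(k, m) + 1):
            total += P(hw, k - hw) * e[hw]
    return 1 + m * total


e6 = [1, 1, 1, 0, 0, 0, 0]      # e_{hw,6} for hw = 0..6, from Section 8.2
print([d_from_foundation(6, e6, n) for n in range(8)])
```

For m = 6 this should reproduce the sequence 1, 7, 19, 37, 61, 91, 127, 169 used in Section 4.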



8.4 Comments on the Technique 

The technique assumes that all polynomials in E have cyclic order m. It seems likely that this is true in general as $d_n$ appears to satisfy $m \mid (d_n - 1)$ for all cases computed in Tables 1 and 2. A proof of Conjecture 1, and a proof of the general form of $c_h(x)$ may well follow if one can do the following for a given m,

1. Derive an efficient method to compute the ’forbidden’ polynomials. 

2. Derive an efficient method to compute the elements ^ of E from the 
forbidden polynomials. 

For large m (e.g. perhaps m = 105?) there may be non-binary forbidden 
polynomials for which the above technique must be modified as follows: Consider, 
as an example, a ’hypothetical’ forbidden polynomial, F{x), of the following 
form: 

F{x) = X® + 3x^ + X + 2 

Then it has an associated binary forbidden polynomial, /(x), where, 

/(x) = X® + X^ + X + 1 

We wish to disallow all polynomials built from the foundation $F(x)$ not $f(x)$. Let the cyclic order (over m) of $F(x)$ and $f(x)$ be v. Then we should include $\gamma_n$ polynomials in our count for $d_n$, where

$$\gamma_n = v \left( \sum_{k=1}^{n} P_4(k - 4) - \sum_{k=1}^{n-3} P_4(k - 4) \right) = v \sum_{k=n-2}^{n} P_4(k - 4)$$

where the '3' in the summation limit of the previous equation is the coefficient weight (cw) of $F(x)$ minus the Hamming weight of $F(x)$. In general, for a given forbidden polynomial $F(x)$ we include $\gamma_n$ in our count for $d_n$, where $\gamma_n$ satisfies,

$$\gamma_n = v \sum_{k = n + hw(F(x)) - cw(F(x)) + 1}^{n} P_{hw(F(x))}(k - hw(F(x)))$$

In the case where the forbidden polynomial is a binary polynomial $hw(F(x)) = cw(F(x))$ and $\gamma_n$ for $F(x)$ is 0, as expected. Things will be further complicated if the cyclic order of $F(x)$ is lower than that of $f(x)$.
Abstract. In this paper, we present a new approach for computing normal forms in the quotient algebra A of a polynomial ring R by an ideal I. It is based on a criterion, which gives a necessary and sufficient condition for a projection onto a set of polynomials, to be a normal form modulo the ideal I. This criterion does not require any monomial ordering and generalizes the Buchberger criterion of S-polynomials. It leads to a new algorithm for constructing the multiplicative structure of a zero-dimensional algebra. Described in terms of intrinsic operations on vector spaces in the ring of polynomials, this algorithm extends naturally to Laurent polynomials.



1 Introduction 

Solving polynomial equations is ubiquitous in many domains like robotics, computer vision, signal processing, ... One of the algebraic approaches for treating this problem safely consists in constructing a normal form function in the quotient algebra A of the ring of polynomials R by the ideal I generated by the polynomials that one wants to solve. Once this normal form is known, one can solve the polynomial system, either by using symbolic methods like elimination of variables [7], [10], [9], rational representation [14] [p. 88], [20], [1], [12], [21], ... and solving a univariate polynomial equation, or by matrix manipulations and eigenvalue or eigenvector computations [13], [2], [15], [16], ... In practice, many of the polynomial systems are given with approximate coefficients, which means that we have to consider not just a single polynomial system but its neighborhood. Despite this, the aforementioned methods proceed pointwise and do not take into account the continuity that exists in this neighborhood.

Our motivation here is to develop a method which can exploit as much as 
possible, the continuity in the coefficients of the input system. It is well-known 
that the geometry of the solutions is not a global continuous function of the 
parameters of a polynomial system. In some specific configurations, by a small 
perturbation of the coefficients, we can for instance blow-up points into curves. 
However, in many practical situations such as in robotics, in computer vision, 
. . . , the geometry of the solutions is not changing in a neighborhood of the 
input system. We are interested in algorithms, which can take into account 
this continuity in a given neighborhood. From a practical point of view, such a 
requirement is necessary for developing safe and numerically stable methods for solving polynomial equations with approximate coefficients.

As is illustrated below, one of the customary tools used nowadays to compute normal forms (i.e. Gröbner bases) is not well suited to this situation. Consider for instance a system of equations in two variables of the form

$$\begin{cases} p_1 := a x_1^2 + b x_2^2 + l_1(x_1, x_2) = 0 \\ p_2 := c x_1^2 + d x_2^2 + l_2(x_1, x_2) = 0 \end{cases}$$

where $a, b, c, d \in \mathbb{C}$ are complex numbers, $ad - bc \ne 0$ and $l_1, l_2$ are linear forms. Form a Gröbner basis of these polynomials for a monomial order refining the degree order. The initial ideal is generated by $(x_1^2, x_2^2)$ and the corresponding basis of $A = \mathbb{C}[x_1, x_2]/(p_1, p_2)$ is $\{1, x_1, x_2, x_1 x_2\}$.




Fig. 1. The two conics with horizontal and vertical axis and the basis $\{1, x_1, x_2, x_1 x_2\}$ of $A = \mathbb{C}[x_1, x_2]/(p_1, p_2)$, deduced from a Gröbner basis computation for a monomial ordering refining the degree ordering.



Consider now a small perturbation of this system

$$\begin{cases} \tilde{p}_1 = p_1 + \varepsilon_1 x_1 x_2 \\ \tilde{p}_2 = p_2 + \varepsilon_2 x_1 x_2, \end{cases}$$

where $\varepsilon_1, \varepsilon_2 \in \mathbb{C}$ are “small” parameters. The zero-set is also the points of intersection of two conics, which are slightly deformed.




Fig. 2. The perturbed conics and the basis $\{1, x_1, x_2, x_2^2\}$ of $A = \mathbb{C}[x_1, x_2]/(\tilde{p}_1, \tilde{p}_2)$, deduced from a Gröbner basis computation for the same monomial ordering.
As a result of a small perturbation, the basis may “jump” from one set of monomials to another, though the two situations are very close to each other from a geometric point of view. Moreover, some of the polynomials of the Gröbner basis have large coefficients, for we have to divide by coefficients of the order $\varepsilon_1, \varepsilon_2$. Thus, this computation has to be carried out with exact and multi-precision arithmetic, inducing an unnecessary additional cost in the resolution process.

As we can see from this example, Gröbner basis computations may introduce artificial discontinuities, due to the choice of a monomial order. Indeed, computing a Gröbner basis for a given monomial order only allows “flat” deformations of the initial ideal of the input system (see [8]). In this work, we want to consider more geometric deformations, by removing the constraints induced by monomial orders. We present a new method for computing the normal form with respect to a set of polynomials, provided that this set of polynomials satisfies a property of connectivity¹. This construction is based on a new criterion, which gives a necessary and sufficient condition for a projection onto this set of polynomials to be a normal form modulo the ideal I generated by the equations that we want to solve. This criterion generalizes the Buchberger criterion. Indeed, if the projection is compatible with a monomial order and if the basis is a set of monomials not in the initial of the ideal, we recover the S-polynomial criterion. Moreover, it extends naturally to Laurent polynomials, as we will see. This leads us to new algorithms for solving polynomial equations, which are described in intrinsic terms, involving operations on vector-spaces of polynomials instead of pairs of polynomials. Consequently, the structure of the objects involved in the computation can be handled more efficiently [18].

The paper is organized as follows. In section 2, we define the reduction onto a 
vector space B and the notion of connectivity. In section 3, we describe and prove 
the criterion, relating the canonicity of the B-reduction and the commutativity 
of the operators of multiplication. In section 4, we present a new algorithm, 
based on this criterion, for constructing the multiplicative structure of a zero- 
dimensional algebra. Its natural and direct generalization to Laurent polynomials 
is given in section 5. 



2 Definitions 

In this section, we introduce the notations that will be used hereafter. Let $\mathbb{K}$ be a field. We denote by $R = \mathbb{K}[x_1, \dots, x_n] = \mathbb{K}[x]$ the ring of polynomials in the variables $x_1, \dots, x_n$ with coefficients in $\mathbb{K}$. For any $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}^n$, we denote by $x^\alpha$ the monomial $x_1^{\alpha_1} \cdots x_n^{\alpha_n}$, by $|\alpha|$ its degree $\deg(x^\alpha) = |\alpha| = \alpha_1 + \dots + \alpha_n$ and by $(x)^*$ the set of all monomials in the variables $x_1, \dots, x_n$.

In order to have “homogeneous” notations, we will write $x_0 = 1$.

For any subset $G \subset R$, we denote by $\langle G \rangle$ the vector space generated by the elements of G and by $(G)$ the ideal generated by its elements in R.

¹ This is the case, for instance, for the set of monomials which are not in the initial of an ideal.
Definition 2.1. For any linear subspace V of $R = \mathbb{K}[x_1, \dots, x_n]$, we write

$$V^+ = V + x_1 \cdot V + \dots + x_n \cdot V$$

the vector space generated by the elements $v$, $x_i v$ for $v \in V$, $i = 1, \dots, n$.

If $m \subset (x)^*$ is a monomial subset of R, we also denote by $m^+$ the union of $m$, $x_1 \cdot m$, ..., $x_n \cdot m$, so that $\langle m \rangle^+ = \langle m^+ \rangle$.

We denote by $V^{[d]}$ the d-th power of the operator + on V. By convention, $V^{[0]} = V$. The set $V^{[\infty]} = \cup_{d \ge 0} V^{[d]}$ is also the ideal generated by V and denoted by $(V)$.

Definition 2.2. For any polynomial $p \in R$ and any vector space $B \subset R$, the B-index of p is the minimum of d such that $p \in B^{[d]}$ if $p \in (B)$ and $-\infty$ otherwise.

The $B$-index of $p$ corresponds to the “distance” between $p$ and $B$:




Next, we define the reduction of a polynomial on B, taking into account the 
“geometry” of B, i.e. the B-index of the polynomial: 

Lemma 2.3. Let $B$ and $K$ be vector spaces of $R$ such that $B + K = B^+$. Then
any element $p$ of $B^{[d]}$ of $B$-index $d$ can be reduced by an element $k \in K^{[d-1]}$ to
an element $r \in B$: $p = r + k$. The polynomial $r$ is called a $B$-remainder of $p$
along $K$.

Proof. We prove it by induction on the $B$-index of a polynomial. The property
is true for the elements in $B^{[1]} = B^+$, which are of $B$-index $\leq 1$ and which can
be reduced by $K = K^{[0]}$ to elements of $B$.

Assume that the property is true for polynomials of $B$-index $\leq d - 1$ and let
$p$ be a polynomial of $B$-index $d$. Then $p \in B^{[d]}$ is of the form

$$p = \sum_{i=0}^{n} x_i p_i,$$

with $p_i \in B^{[d-1]}$. By induction hypothesis, as $p_i$ is of index $\leq d - 1$, there exist
$r_i \in B$, $k_i \in K^{[d-2]}$ such that $p_i = r_i + k_i$, so that

$$p = \sum_{i=0}^{n} x_i r_i + \sum_{i=0}^{n} x_i k_i = r' + k'$$

with $r' = \sum_{i=0}^{n} x_i r_i \in B^+$ and $k' = \sum_{i=0}^{n} x_i k_i \in K^{[d-1]}$. By hypothesis $r' = r + k''$
with $r \in B$, $k'' \in K$. Therefore, $p = r + k$ with $r \in B$, $k = k' + k'' \in K^{[d-1]}$.

Remark 2.4. If $1 \in B$ then $B^{[\infty]} = R$ and any polynomial $p \in R$ can be $B$-reduced
along $K$.

If $B = \langle \mathbf{m} \rangle$, where $\mathbf{m}$ is a set of monomials containing 1, then $B^{[d]}$ is generated
by monomials of the form $\mathbf{x}^\alpha = \mathbf{x}^{\alpha'} \mathbf{x}^{\alpha''}$ with $\mathbf{x}^{\alpha'} \in \mathbf{m}$ and $|\alpha''| \leq d$. The $B$-index
of $\mathbf{x}^\alpha$ is the minimum of $|\alpha''|$ in such a decomposition. We
describe an algorithm of reduction, in this case.

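Algorithm 2.5 Reduction onto $B = \langle \mathbf{m} \rangle$ with $1 \in \mathbf{m} \subset (\mathbf{x})^*$.

For any monomial $\mathbf{x}^\alpha$ of $B$-index $d$,

1. Decompose $\mathbf{x}^\alpha$ as $\mathbf{x}^\alpha = x_i \mathbf{x}^{\alpha'}$ with $\mathbf{x}^{\alpha'} \in B^{[d-1]}$ for an appropriate $i$.

2. Compute the $B$-reduction $r' \in B$ of $\mathbf{x}^{\alpha'}$.

3. Compute the $B$-reduction $r \in B$ of $x_i r' \in B^+$ by projection of $B^+$ on $B$ along
$K$.

The polynomial $r$ is a $B$-remainder of $\mathbf{x}^\alpha$ along $K$.

Note that the remainder $r$ is not necessarily unique. This algorithm gives one of
the possible $B$-remainders.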
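
The monomial case of Remark 2.4 is easy to experiment with. The following is a minimal sketch, assuming monomials are stored as exponent tuples; the helper names `plus` and `b_index` are illustrative and not taken from the paper. It builds $\mathbf{m}^+$ and computes the $B$-index of a monomial as the smallest $|\alpha''|$ over all decompositions $\mathbf{x}^\alpha = \mathbf{x}^{\alpha'} \mathbf{x}^{\alpha''}$ with $\mathbf{x}^{\alpha'} \in \mathbf{m}$.

```python
def plus(monos, n):
    """Return m^+: the union of m, x_1*m, ..., x_n*m (monomials as exponent tuples)."""
    out = set(monos)
    for mono in monos:
        for i in range(n):
            out.add(tuple(e + (1 if k == i else 0) for k, e in enumerate(mono)))
    return out


def b_index(alpha, monos):
    """Smallest |alpha''| with x^alpha = x^alpha' * x^alpha'' and x^alpha' in m.

    Returns None when no monomial of m divides x^alpha.
    """
    best = None
    for mono in monos:
        if all(a >= b for a, b in zip(alpha, mono)):
            gap = sum(alpha) - sum(mono)
            best = gap if best is None else min(best, gap)
    return best


# Illustrative data (an assumption, not from the paper): m = {1, x, y} in two variables.
m = {(0, 0), (1, 0), (0, 1)}
index_of_x2y = b_index((2, 1), m)   # B-index of x^2 * y
m1 = plus(m, 2)                     # m^+
m2 = plus(m1, 2)                    # (m^+)^+
```

With this choice of $\mathbf{m}$, the monomial $x^2 y$ first appears after applying the operator $+$ twice, which agrees with the index computed from the decomposition.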

Definition 2.6. The $B$-reduction along $K$ is canonical modulo $(K)$ iff for any
$n \in \mathbb{N}$ and any $p \in B^{[n]}$, there exists a unique pair of $r \in B$ and $k \in K^{[n-1]}$
such that $p = r + k$: $B^{[n]} = B \oplus K^{[n-1]}$.

We introduce now a notion which will be used extensively hereafter. It is a 
property of connectivity, which allows us to reach any element of a vector space 
from a single element, by multiplication by the variables Xi, i = 1, . . . ,n. 

Definition 2.7. Let $B$ be a vector space of $R$. We say that $B$ is connected to
$e_0 \in B$, if for any $b \in B$, there exist $b_1, \ldots, b_n \in B$ such that

$$b = \sum_{i=1}^{n} x_i b_i,$$

with $b_i$ a multiple of $e_0$ and $\deg(b_i) < \deg(b)$.

A typical situation is when $B$ is generated by a set of monomials $\mathbf{m}$ such that
for any $m \in \mathbf{m}$, there exist $i_1, \ldots, i_k$ with $m = x_{i_1} \cdots x_{i_k}$ and $x_{i_1} \cdots x_{i_l} \in B$, for
$l = 1, \ldots, k$. In other words, we can go from $m_0 \in \mathbf{m}$ to any element $m \in \mathbf{m}$, by
multiplication by a variable and staying in $\mathbf{m}$. This situation occurs, for instance,
with Gröbner bases when $\mathbf{m}$ is the set of monomials not in the initial of an ideal $I$
for a fixed monomial order. It is a basis of the quotient ring $R/I$ (see Macaulay's
theorem [7]). In our approach we can imagine more complicated bases.
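
For a monomial set, the typical situation described above can be tested mechanically. The sketch below assumes monomials are exponent tuples and that the set is non-empty; the function name `is_connected_to_one` is hypothetical. It checks that 1 belongs to the set and that every other monomial has at least one predecessor $m / x_i$ inside the set, which by induction on the degree gives a path from 1 to $m$ that stays in the set.

```python
def is_connected_to_one(monos):
    """Check that every monomial of a non-empty set can be reached from 1.

    Each step multiplies by one variable and must stay inside the set.
    Monomials are exponent tuples of equal length.
    """
    monos = set(monos)
    one = tuple(0 for _ in next(iter(monos)))
    if one not in monos:
        return False
    for mono in monos:
        if mono == one:
            continue
        # All monomials obtained by dividing `mono` by one of its variables.
        predecessors = [
            tuple(e - (1 if k == i else 0) for k, e in enumerate(mono))
            for i, e_i in enumerate(mono)
            if e_i > 0
        ]
        if not any(p in monos for p in predecessors):
            return False
    return True


chain = {(0,), (1,), (2,)}    # {1, x, x^2}
gap = {(0,), (1,), (3,)}      # {1, x, x^3}: x^3 has no predecessor x^2
```

The second set is an example of a monomial set that contains 1 but is not connected to it.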

3 Criterion 

In this section, we consider a vector space $B$ connected to 1 and a projection $N$
from $B^+$ onto $B$. As we have seen in the previous section any polynomial $p \in R$
can be $B$-reduced along the kernel of $N$. The problem that we address next is
the following:

How to decide if the reduction is canonical, using only the “local” information
that we have, i.e. the projection $N$ from $B^+$ onto $B$?

Theorem 3.1. Let $B$ be a vector space of $R$ connected to 1. Let $N : B^+ \to B$
be a $\mathbb{K}$-linear map such that $N_{|B} = \mathbb{I}$, the identity of $B$. Let $I = (\ker(N))$ be the ideal
generated by the kernel of $N$. We define

$$M_i : B \to B, \qquad b \mapsto N(x_i b).$$

The two properties are equivalent:

1. For all $1 \leq i, j \leq n$, $M_i \circ M_j = M_j \circ M_i$.

2. $R = B \oplus I$.

If this holds, the map $B$-reduction along $\ker(N)$ is canonical.
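
The commutation test in property 1 is a finite matrix computation. The following sketch is only an illustration under assumed data: $B = \langle 1, x, y, xy \rangle$ in two variables, and two hand-written maps $N$ on the monomials of $B^+$. The names `operators`, `matmul`, `commute`, `N_good` and `N_bad` are hypothetical. The first map reduces modulo $x^2 - 2$ and $y^2 - 3$; the second differs only in sending $x^2 y$ to 0.

```python
BASIS = [(0, 0), (1, 0), (0, 1), (1, 1)]   # B = <1, x, y, xy>, exponent tuples


def operators(N):
    """Build the matrices of M_x and M_y on B from N, given on the monomials of B^+.

    N maps an exponent tuple to its coordinate list in BASIS.
    Column c of each matrix holds the image of the c-th basis monomial.
    """
    ops = []
    for var in range(2):
        images = []
        for mono in BASIS:
            shifted = tuple(e + (1 if k == var else 0) for k, e in enumerate(mono))
            images.append(N[shifted])
        ops.append([[images[c][r] for c in range(4)] for r in range(4)])
    return ops


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def commute(N):
    Mx, My = operators(N)
    return matmul(Mx, My) == matmul(My, Mx)


# Assumed example: N is the identity on B and reduces x^2 -> 2, y^2 -> 3.
N_good = {
    (0, 0): [1, 0, 0, 0], (1, 0): [0, 1, 0, 0],
    (0, 1): [0, 0, 1, 0], (1, 1): [0, 0, 0, 1],
    (2, 0): [2, 0, 0, 0], (0, 2): [3, 0, 0, 0],
    (2, 1): [0, 0, 2, 0], (1, 2): [0, 3, 0, 0],
}
# Same map except that x^2*y is sent to 0 instead of 2*y.
N_bad = dict(N_good)
N_bad[(2, 1)] = [0, 0, 0, 0]
```

For the second map the operators do not commute: applying $M_y$ after $M_x$ to $x$ gives $2y$, while the other order gives 0. This matches the fact that the kernel then generates an ideal containing $y$, so $B \cap I \neq \{0\}$.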

Theorem 3.1 will be proved in two steps: First, we describe the normalization
$N$ in terms of the operators $M_i$. For any $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n$, we denote by
$M^\alpha$ the operator $M_1^{\alpha_1} \circ \cdots \circ M_n^{\alpha_n}$. Note that if the operators $M_i$ commute, $M^\alpha$
is independent of the order in which we compose the operators. By linearity,
for any $p \in R$, we also define $p(M)$, which is obtained by substitution of the
variables $x_i$ by the operators $M_i$ in $p$. By convention, we will take $M_0 = \mathbb{I}$.

Proposition 3.2. Assume that the vector space $B$ is connected to 1 and that
the operators $M_i$ commute. Then for any $b \in B^+$, $N(b) = b(M)(1)$.

Proof. We first prove by induction on the degree that for $b \in B$, we have
$b(M)(1) = b$. If $b = 1$, then we have $b(M)(1) = M^0(1) = 1$, which proves
the property for polynomials in $B$ of degree 0. Assume that the proposition is
true for all $b' \in B$ of degree $< n$. By hypothesis, for any polynomial $b \in B$ of
degree $\leq n$, there exist $b_1, \ldots, b_n \in B$ such that $b = \sum_{i=1}^{n} x_i b_i$ with $\deg(b_i) < n$.
Then by induction hypothesis, we have

$$b(M)(1) = \Big( \sum_{i=1}^{n} x_i b_i \Big)(M)(1) = \sum_{i=1}^{n} M_i \circ b_i(M)(1)$$

$$= \sum_{i=1}^{n} M_i(b_i) = \sum_{i=1}^{n} N(x_i b_i) = N\Big( \sum_{i=1}^{n} x_i b_i \Big) = N(b) = b,$$

for $N = \mathbb{I}$ on $B$.

We deduce now the property for the elements of the form $b = x_i b'$ with
$b' \in B$. By definition,

$$N(x_i b') = M_i(b') = M_i(b'(M)(1)) = b(M)(1).$$

By linearity, it extends to $B^+$.
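
Once the operators commute, the quantity $p(M)(1)$ can be evaluated by Horner's rule. The sketch below uses an assumed univariate example that is not in the paper: $B = \langle 1, x \rangle$ and the operator $M$ of multiplication by $x$ modulo $x^2 - 3x + 2$. The function name `normal_form` is illustrative.

```python
# Multiplication by x on B = <1, x> modulo x^2 - 3x + 2 (assumed example).
# Columns are the images of 1 and x:  x*1 = x,  x*x = 3x - 2.
M = [[0, -2],
     [1, 3]]


def normal_form(p):
    """Evaluate p(M)(1) by Horner's rule.

    p lists the coefficients of the polynomial from degree 0 upward.
    The result is the coordinate pair of p(M)(1) in the basis (1, x).
    """
    acc = [0, 0]
    for c in reversed(p):
        acc = [M[0][0] * acc[0] + M[0][1] * acc[1] + c,   # M*acc + c*1, constant part
               M[1][0] * acc[0] + M[1][1] * acc[1]]       # coefficient of x
    return acc


nf_x3 = normal_form([0, 0, 0, 1])    # x^3
nf_f = normal_form([2, -3, 1])       # the generator x^2 - 3x + 2 itself
```

Here $x^3$ reduces to $7x - 6$, and the generator itself reduces to 0, as expected for an element of the ideal.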

Secondly, we describe the ideal $I$ generated by $\ker(N)$ in terms of the operators
$M_i$.

Proposition 3.3. Assume that the vector space $B$ is connected to 1 and that
the operators $M_i$ commute. Then the ideal $I = (\ker(N))$ is

- $(p(\mathbf{x}) - p(M)(1))_{p \in R}$,

- $\{p \in R \text{ s.t. } p(M)(1) = 0\}$.

Proof. For any $p \in R$, let $\sigma(p) = p(M)(1) \in B$ and let $J = (p(\mathbf{x}) - \sigma(p))_{p \in R}$.
We first prove that $J$ coincides with the ideal $J' = \{p \in R \text{ s.t. } p(M)(1) = 0\}$.
According to proposition 3.2, we have

$$(p(\mathbf{x}) - \sigma(p))(M)(1) = \sigma(p) - \sigma(p)(M)(1) = \sigma(p) - \sigma(p) = 0,$$

which proves that $J \subset J'$.

Conversely, for any $p \in J'$, such that $\sigma(p) = p(M)(1) = 0$, we have

$$p(\mathbf{x}) = p(\mathbf{x}) - \sigma(p) \in J,$$

which proves that $J = J'$ is an ideal of $R$.

Let us show now that $J = I$. As $N \circ N = N$, the kernel $\ker(N)$ is generated
by the polynomials $b - N(b)$, $b \in B^+$. Therefore according to proposition 3.2,
$I = (\ker(N))$ is generated by the polynomials $b - b(M)(1)$, for $b \in B^+$, which
implies that $I \subset J$.

Let us prove conversely, by induction on the degree, that $J \subset I$. For any
$\alpha \neq 0 \in \mathbb{N}^n$, there exist $i \in \mathbb{N}$ and $\alpha' \in \mathbb{N}^n$ such that $\mathbf{x}^\alpha = x_i \mathbf{x}^{\alpha'}$. We have

$$\mathbf{x}^\alpha - M^\alpha(1) = x_i (\mathbf{x}^{\alpha'} - M^{\alpha'}(1)) + x_i M^{\alpha'}(1) - M_i \circ M^{\alpha'}(1).$$

By induction, the element $(\mathbf{x}^{\alpha'} - M^{\alpha'}(1)) \in J$ is also in $I$. Moreover, denoting
$b_{\alpha'} = M^{\alpha'}(1)$, we have $x_i b_{\alpha'} - M_i(b_{\alpha'}) = x_i b_{\alpha'} - N(x_i b_{\alpha'}) \in \ker(N) \subset I$, which
proves that $\mathbf{x}^\alpha - M^\alpha(1) \in I$. By linearity, this implies that $J \subset I$, and $I = J$.

Proof of Theorem 3.1.

$2 \Rightarrow 1$. If $R = B \oplus I$ then $B \cap I = \{0\}$. As for any $b \in B$, we have $M_i(b) \equiv x_i b$
modulo $I$, we deduce that for any $1 \leq i, j \leq n$, we have

$$(M_i \circ M_j - M_j \circ M_i)(b) \equiv (x_i x_j - x_j x_i) b \equiv 0$$

modulo $I$. But we also have $(M_i \circ M_j - M_j \circ M_i)(b) \in B$. Therefore,
$(M_i \circ M_j - M_j \circ M_i)(b) = 0$.

$1 \Rightarrow 2$. Consider the map

$$\sigma : R \to B, \qquad p \mapsto p(M)(1)$$

which is surjective, according to proposition 3.2: $\forall b \in B$, $\sigma(b) = b$. By proposition
3.3, its kernel is the ideal $I = (\ker(N))$, which proves that $B$ is isomorphic to
$R/I$ and $R = B \oplus I$. □

Remark 3.4. The hypothesis that $B$ is connected to 1 is necessary in the proof
of the theorem. Consider, for instance, the case where $B = \langle 1, x, x^3 \rangle$ and $N$
defined on $B^+ = \langle 1, x, \ldots, x^4 \rangle$ by $N(1) = 1$, $N(x) = x$, $N(x^2) = 0$, $N(x^3) = x^3$,
$N(x^4) = x^3$. Then, the ideal generated by $\ker(N)$ is $I = (x^2)$. There is only one
operator $M_1$, so that the hypothesis of commutation is obviously satisfied, but
$\{1, x, x^3\}$ is not a basis of $R/I$: $B \cap I = \langle x^3 \rangle$.



Remark 3.5. Criterion 3.1 generalizes the Buchberger criterion [5], [6], [3], [7],
which says that a set of polynomials $G = \{g_1, \ldots, g_s\}$ is a Gröbner basis for a given
monomial order if for any pair $1 \leq i_1, i_2 \leq s$, the $S$-polynomial of $(g_{i_1}, g_{i_2})$ reduces
to 0 by $G$. Let $m_{i_1}$ (resp. $m_{i_2}$) be the leading term of $g_{i_1}$ (resp. $g_{i_2}$), $m =
\mathrm{lcm}(m_{i_1}, m_{i_2})$, $x_{a_1}$ (resp. $x_{a_2}$) a variable dividing $m_{i_1}$ (resp. $m_{i_2}$) with $a_1 \neq a_2$.
Thus we have $m = x_{a_1} x_{a_2} m'$. Let us consider a pair $(i_1, i_2)$ such that $m'$ is not
in the initial (Buchberger's criterion 3) of $G$. For any polynomial $p \in R$, let
$N(p)$ be the remainder of $p$ in the reduction by $G$. Then we can check that

$$(M_{a_1} \circ M_{a_2} - M_{a_2} \circ M_{a_1})(m')$$

is the remainder of the $S$-polynomial

$$S(g_{i_1}, g_{i_2}) = m \Big( \frac{g_{i_1}}{m_{i_1}} - \frac{g_{i_2}}{m_{i_2}} \Big),$$

by reduction by $G$.

4 Computing the Multiplicative Structure of 
Zero-Dimensional Algebra 

In this section, we describe a new algorithm for computing the multiplication
table of a quotient algebra of dimension 0, based on the criterion of theorem
3.1. More precisely, we compute the normalization from $B^+$ to $B$, which yields
directly the multiplication tables by the variables $x_i$, and consequently the roots
of the system by eigenvector computations [2], [15], [16].

Compared with Gröbner basis computations, instead of considering pairwise
reductions, we manipulate coefficient matrices of polynomials, offering a global
view of their structure. This enables us to exploit the structure involved in the
computation, such as sparsity [4] or quasi-Toeplitz structure [18], [17], and it leads
to algorithms for polynomials with approximate coefficients whose numerical
stability can be analyzed more clearly. Similar ideas are also used in [11], where
an extension of Buchberger's algorithm using linear algebra tools is proposed.

Let $f_1, \ldots, f_m \in R$ be $m$ polynomials and $L$ a finite-dimensional vector space connected
to 1 and containing $f_1, \ldots, f_m$. Let $I$ be the ideal of $R$ generated by $f_1, \ldots, f_m$.

We assume here that $A = R/I$ is zero-dimensional. Let us define by induction
the following vector spaces:

- $K_0 = \langle f_1, \ldots, f_m \rangle$,

- $K_{n+1} = K_n^+ \cap L$, $n \geq 0$.









Since $L$ is a finite-dimensional vector space, the increasing sequence $K_n$ is stationary. We
denote by $K_*$ the union of all the vector spaces $K_n$.

By construction, we have $K_* \subset I$. We assume that $1 \notin K_*$, otherwise the
ideal $I = (f_1, \ldots, f_m)$ is trivial. Then we have the following property:

Proposition 4.1. Let $B$ be a supplementary space of $K_*$ in $L$, connected to 1,
and assume that $B^+ \subset L$. Let $N$ be the projection of $L$ onto $B$ along $K_*$ and

$$M_i : B \to B, \qquad b \mapsto M_i(b) = N(x_i b).$$

Then we have $M_i \circ M_j = M_j \circ M_i$, for $1 \leq i, j \leq n$.

Proof. Let $p = \mathbb{I} - N$ be the projection onto $K_*$ along $B$. For any $b \in B$
and for all $1 \leq i < j \leq n$, we have

$$M_j \circ M_i(b) = M_j(x_i b - p(x_i b))$$

$$= x_j (x_i b - p(x_i b)) - p(x_i x_j b - x_j\, p(x_i b))$$

$$= x_i x_j b + x_j k_1 + k_2$$

with $k_1, k_2 \in K_*$. Similarly, we have

$$M_i \circ M_j(b) = x_i x_j b + x_i k_1' + k_2',$$

with $k_1', k_2' \in K_*$, so that

$$(M_j \circ M_i - M_i \circ M_j)(b) = x_j k_1 + k_2 - x_i k_1' - k_2' \in K_*^+.$$

By definition of the operators $M_i$, we also have $(M_j \circ M_i - M_i \circ M_j)(b) \in B$,
but $B \cap K_*^+ = B \cap K_*^+ \cap L = B \cap K_* = \{0\}$. Consequently, the operators $M_i$
commute.

Theorem 4.2. Let $B$ be a supplementary space of $K_*$ in $L$, connected to 1.
Assume that $B^+ \subset L$. Then $R = B \oplus I$ and the $B$-reduction along $\underline{K} = K_* \cap B^+$
is canonical modulo $I = (f_1, \ldots, f_m)$.

Proof. Assume that $B \oplus K_* = L$ and $B^+ \subset L$. Let $N$ be the projection of $L$
onto $B$ along $K_*$ and $M_i : B \to B$ such that $\forall b \in B$, $M_i(b) = N(x_i b)$. Denoting
by $\underline{K} = K_* \cap B^+$, we have $B^+ = B \oplus \underline{K}$.

According to proposition 4.1, the operators $M_i$ commute. Thus by theorem
3.1, we have $R = B \oplus J$, where $J$ is the ideal generated by $\underline{K} = K_* \cap B^+ =
\ker(N) \cap B^+$. By construction, as $\underline{K} \subset K_* \subset I$, we have $J \subset I$.

Let us prove by induction on the degree that for any $p \in L$, we have $N(p) =
p(M)(1)$. The property is obviously true for polynomials of degree 0. Assume
that it is true for polynomials of degree $< d$ and let $p \in L$ be a polynomial of
degree $d$. As $L$ is connected to 1, there exist $p_1, \ldots, p_n \in L$ such that

$$p = \sum_{i=1}^{n} x_i p_i,$$

with $\deg(p_i) < d$. We deduce that

$$p - p(M)(1) = \sum_{i=1}^{n} \Big[ x_i \big( p_i - p_i(M)(1) \big) + \big( x_i\, p_i(M)(1) - M_i(p_i(M)(1)) \big) \Big].$$

By induction, $p_i - p_i(M)(1) \in K_*$. Moreover, as $p_i(M)(1) \in B$, $k_i = x_i\, p_i(M)(1) -
M_i(p_i(M)(1))$ is in $K_*$, so that

$$p - p(M)(1) \in (K_*)^+ \cap L = K_*.$$

This implies that, for any $p \in L$, the projection $N(p)$ of $p$ onto $B$ along $K_*$ is
$p(M)(1)$: $N(p) = p(M)(1)$.

As $f_j \in K_* \subset L$ and $N(f_j) = 0 = f_j(M)(1)$, we deduce from proposition
3.3 that $f_j \in J$, so that $I = J$. This proves that $R = B \oplus I$ where $I = (\underline{K})$.

Theorem 4.2 leads to the following algorithm, for constructing a normal form
function in $A = R/I$.

Algorithm 4.3 Normal form for $A$.

Input: Let $f_1, \ldots, f_m \in R$ and $I = (f_1, \ldots, f_m)$ and assume that $A = R/I$ is
zero-dimensional. Let $L$ be a finite-dimensional vector space of $R$ containing $f_1, \ldots, f_m$ and
connected to 1.

(1) $K_0 = \langle f_1, \ldots, f_m \rangle$;

(2) While $K_n \neq K_{n-1}$, compute $K_{n+1} = K_n^+ \cap L$, replace $n$ by $n + 1$;

(3) Compute a supplementary vector space $B$ of $K_*$ ($= K_{n_0}$), which is connected
to 1;

(4) If $B^+ \not\subset L$, replace $L$ by $L^+$ and go to step (2). Otherwise stop.

Output: The $B$-reduction along $K_*$ is canonical modulo the ideal $I = (f_1, \ldots, f_m)$.

This algorithm necessarily stops. Otherwise, $L$ would have contained the multiples
of $f_1, \ldots, f_m$ of degree $k$, with $k$ as large as we want. Consequently, $L$ would
have contained the $S$-polynomials and the remainders involved in the computation
of the Gröbner basis of $f_1, \ldots, f_m$ for a fixed monomial order. In other
words, $L$ would have contained a Gröbner basis of $I = (f_1, \ldots, f_m)$. Thus a
supplementary space $B$ of $K_*$ in $L$, connected to 1, would have had dimension
$D = \dim_{\mathbb{K}}(R/I)$ so that $B^+ \subset L$ if $k > D$.

According to theorem 4.2, when the algorithm stops, the $B$-reduction along
$\underline{K} = K_* \cap B^+$ is canonical modulo $I$.

Since $L$ is connected to 1, the vector space $B$ connected to 1 and supplementary
to $K_*$ can be computed incrementally, starting from 1 (in the case where
$1 \notin K_*$).
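
Steps (1) and (2) of Algorithm 4.3 amount to repeated linear algebra. The following is a minimal sketch for the univariate case only, under assumptions that are not in the paper: polynomials are coefficient lists over the rationals, $L = \langle 1, x, \ldots, x^D \rangle$, and the intersection $K_n^+ \cap L$ is obtained by Gaussian elimination that eliminates the coefficient of $x^{D+1}$ first. The names `row_reduce` and `saturate` are hypothetical.

```python
from fractions import Fraction


def row_reduce(rows):
    """Gaussian elimination, choosing pivots from the highest-degree column down.

    Returns an independent list of rows spanning the same space.
    """
    rows = [r[:] for r in rows]
    basis = []
    ncols = len(rows[0]) if rows else 0
    for col in reversed(range(ncols)):
        pivot = next((r for r in rows if r[col] != 0), None)
        if pivot is None:
            continue
        rows.remove(pivot)
        pivot = [c / pivot[col] for c in pivot]
        # Clear this column in every remaining row.
        rows = [[a - r[col] * b for a, b in zip(r, pivot)] for r in rows]
        basis.append(pivot)
    return basis


def saturate(gens, D):
    """Iterate K_{n+1} = K_n^+ ∩ L with L = <1, x, ..., x^D> until it is stationary.

    Coefficient lists run from degree 0 upward and are padded to length D + 2,
    so that the product x * k of an element k of L still fits.
    """
    pad = lambda g: [Fraction(c) for c in g] + [Fraction(0)] * (D + 2 - len(g))
    K = row_reduce([pad(g) for g in gens])
    while True:
        shifted = [[Fraction(0)] + k[:-1] for k in K]      # x * k for k in K
        K_plus = row_reduce(K + shifted)
        # Rows with no x^(D+1) term span K^+ ∩ L, because that column was eliminated first.
        K_next = [r for r in K_plus if r[D + 1] == 0]
        if len(K_next) == len(K):
            return K_next
        K = K_next


# Assumed example: f = x^2 - 3x + 2 with L = <1, x, x^2, x^3>.
K_star = saturate([[2, -3, 1]], 3)
dim_B = (3 + 1) - len(K_star)      # dimension of a supplementary space of K_* in L
```

In this example $K_*$ is spanned by $f$ and $x f$, so a supplementary space in $L$ has dimension 2, the number of roots of $f$. Choosing $B = \langle 1, x \rangle$ gives $B^+ \subset L$, which is the stopping condition of step (4).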

5 The Case of Laurent Polynomials 

Generalizations of Gröbner bases to Laurent polynomials have already been
considered, but they require specific adjustments (monomial order, special $S$-polynomials
[19], introduction of new variables [22], ...). In this section, we
describe a direct and natural extension of algorithm 4.3 to zero-dimensional
quotient algebras of Laurent polynomials. Let $S = \mathbb{K}[x_1^{\pm 1}, \ldots, x_n^{\pm 1}] = \mathbb{K}[\mathbf{x}^{\pm}]$ be the
ring of Laurent polynomials in the variables $x_1, \ldots, x_n$, with coefficients in $\mathbb{K}$.
Let $f_1, \ldots, f_m \in S$ and let $I$ be the ideal of $S$ generated by these polynomials.
We assume that $A = S/I$ is zero-dimensional.

For any vector space $V$ of $S$, we define

$$V^{\pm} = V + x_1 V + \cdots + x_n V + x_1^{-1} V + \cdots + x_n^{-1} V.$$

Let $L$ be a finite-dimensional vector space of $R \subset S$ connected to 1 and containing
$f_1, \ldots, f_m$. We define by induction:

- $K_0 = \langle f_1, \ldots, f_m \rangle$,

- $K_{n+1} = K_n^{\pm} \cap L$, $n \geq 0$.

As $L$ is a finite-dimensional vector space, the increasing sequence $K_n$ is stationary and we
denote by $K_*$ the maximum (or union) of all the vector spaces $K_n$. We also have
$K_* \subset I$. Moreover, we have the following property:

Proposition 5.1. Let $B$ be a supplementary space of $K_*$ in $L$, connected to 1,
and assume that $B^{\pm} \subset L$. Let $N$ be the projection of $L$ onto $B$ along $K_*$ and

$$M_i : B \to B,\ b \mapsto N(x_i b), \qquad M_{-i} : B \to B,\ b \mapsto N(x_i^{-1} b).$$

Then we have $M_i \circ M_j = M_j \circ M_i$ and $M_{-i} \circ M_i = \mathbb{I}$, for $1 \leq i, j \leq n$.

Proof. The proof that the operators commute is similar to the proof of proposition
4.1 ($V^+ \subset V^{\pm}$). Let us prove that $M_{-i} \circ M_i = \mathbb{I}$. For $p = \mathbb{I} - N$ and $b \in B$,
we have

$$M_{-i} \circ M_i(b) = M_{-i}(x_i b - p(x_i b))$$

$$= x_i^{-1}(x_i b - p(x_i b)) - p(b - x_i^{-1} p(x_i b))$$

$$= b - x_i^{-1} p(x_i b) - p(b - x_i^{-1} p(x_i b))$$

$$= b + x_i^{-1} k_1 + k_2,$$

with $k_1, k_2 \in K_*$, so that $M_{-i} \circ M_i(b) - b \in K_*^{\pm} \cap B = K_* \cap B = \{0\}$, which
proves that the inverse of $M_i$ is $M_{-i}$.

We deduce the following theorem: 

Theorem 5.2. Let $B$ be a supplementary space of $K_*$ in $L$, connected to 1. If
$B^{\pm} \subset L$ then $S = B \oplus I$ and $(\underline{K}) = I$ where $\underline{K} = K_* \cap B^{\pm}$.

Proof. Let $\underline{K} = B^{\pm} \cap K_*$ and $J \subset I$ be the ideal of $S$ generated by $\underline{K}$. Similarly
as in theorem 4.2, we prove that for $j = 1, \ldots, m$, the polynomial $f_j$ is in the
ideal generated by $\underline{K}$ in $R$, so that $J = I$. Let us prove now that $S = B \oplus I$.
Consider the map

$$\sigma : S \to B, \qquad p \mapsto p(M)(1).$$

By proposition 3.2, $\forall b \in B$, $\sigma(b) = b(M)(1) = b$ and $\sigma$ is surjective. Let
$p \in \ker(\sigma)$. Then, there exists $\alpha \in \mathbb{Z}^n$ such that $p(\mathbf{x}) = \mathbf{x}^\alpha q(\mathbf{x})$, with $q \in R$.
According to proposition 5.1, $M_i$ is invertible so that $q(M)(1) = M^{-\alpha} \sigma(p) = 0$.
By proposition 3.3, $q$ is in the ideal of $R$ generated by $\underline{K}$, which implies that
$p \in (\underline{K}) = I$. This proves that $\ker(\sigma) = I$ so that $S = B \oplus I$.

This leads to an algorithm, similar to algorithm 4.3: 

Algorithm 5.3 Normal form for $A$ (case of Laurent polynomials).

Input: Let $f_1, \ldots, f_m \in S = \mathbb{K}[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]$ and $I = (f_1, \ldots, f_m)$, and assume that
$A = S/I$ is zero-dimensional. Let $L$ be a finite-dimensional vector space of $R \subset S$ connected to
1 and containing $g_j = \mathbf{x}^{\alpha_j} f_j$ for some $\alpha_j \in \mathbb{Z}^n$ and $1 \leq j \leq m$.

(1) $K_0 = \langle g_1, \ldots, g_m \rangle$.

(2) While $K_n \neq K_{n-1}$ compute $K_{n+1} = K_n^{\pm} \cap L$.

(3) Compute a supplementary vector space $B$ of $K_*$ ($= K_{n_0}$) which is connected to
1.

(4) If $B^{\pm} \not\subset L$, replace $L$ by $L^+$ and go to step (2). Otherwise stop.

Output: The $B$-reduction along $K_*$ is canonical modulo the ideal $I \cap \mathbb{K}[\mathbf{x}]$.

Remark 5.4. A similar result can be obtained by replacing $V^{\pm}$ by

$$V^* = V + x_1 V + \cdots + x_n V + x_{r+1}^{-1} V + \cdots + x_n^{-1} V$$

and $S$ by $\mathbb{K}[x_1, \ldots, x_r, x_{r+1}^{\pm 1}, \ldots, x_n^{\pm 1}]$, for any $r = 1, \ldots, n$.

6 Concluding Remarks 

We end with some of the improvements or extensions that can be brought to this
approach.

Our presentation focuses on zero-dimensional quotient algebras. However the 
criterion that we give is valid for quotient algebras of any dimension and can 
lead to similar algorithms, by modifying the test in the while loop according to 
the “shape” of B. 

In algorithms 4.3 and 5.3, instead of extending $L$ in all directions (step (4)),
one can extend it using only a subset of the variables (say $L^+ = L + x_1 \cdot L +
\cdots + x_r \cdot L$, $1 \leq r \leq n$), in order to compute polynomials of $I$ in this subset of
variables.

The hypothesis of connectivity of $B$ can be removed in theorem 3.1 if
$1 \in B$ and $\dim_{\mathbb{K}}(B) = \dim_{\mathbb{K}}(A) = D$. In this case, $B$ is a generating set of $A$ of
the right dimension and thus isomorphic to $A$.



Abstract. Using multivariate polynomials, Gröbner bases have a great
theoretical interest in decoding cyclic codes beyond their BCH capability
[1,2], but unfortunately have a high complexity [7]. From an engineer's point
of view, the complexity comes in particular from the number of needed
indeterminates, from the maximal number of needed polynomials during
Buchberger's algorithm (this number is unknown), and from the maximal
number of attempts before recovering the error polynomial $e(X)$. In
this paper we propose a new algorithm, using Gröbner bases and the Discrete
Fourier Transform. In most cases this algorithm needs fewer
indeterminates than the algorithm of Chen et al. [1], and at most as many as
the XP algorithm [9] (sometimes fewer). In some cases the maximal number
of needed polynomials for calculations is reduced to 1. Finally, it is
shown that only one attempt is needed for recovering $e(X)$.



Introduction 

For cyclic codes, many known algorithms are very efficient for decoding up to
the designed decoding capability $t_{BCH}$ [3]. For decoding up to the true decoding
capability $t_{true}$, the best algebraic method uses Gröbner bases [1,2]. Unfortunately,
the corresponding algorithm (i.e. Buchberger's algorithm) has such a
high complexity [7] that it cannot be considered for implementation.

In order to decrease the complexity, we propose a new algorithm using
Gröbner bases, which we call the FG-algorithm and which seems to be implementable.
Let us consider $GF(q^r)$, with $\alpha$ some primitive element.

Set $A_r = GF(q^r)[X]/(X^n - 1)$; $(n, q) = 1$; $\beta$ a primitive $n$th root of unity.
Set $T$ for the DFT in this algebra: for every polynomial $a(X)$ in $A_r$ one has
$A(X) = T(a(X)) = \sum_{i=0,\ldots,n-1} a(\beta^i) X^i$.

As usual, each location of a vector of length $n$ is labeled with a power of
$\beta$, explicitly: $\beta^0, \beta^1, \ldots, \beta^{n-1}$. If a location is labeled with $\beta^j$, we say that the location is $j$.

We consider all elements $S(X)$ in $A_r$ whose known coefficients are given
in locations $i_1, i_2, \ldots, i_s$. Using multivariate polynomial equations, we examine
what can be known about the respective corresponding $s(X)$. Then we apply this to
error correcting codes, giving the FG-algorithm. We show how to use it for decoding
up to $t_{true}$ errors, sometimes up to $t_{true} + 1$ ones.

* This work was partly done under PRA9605.

In Part 1, after a few algebraic developments, we introduce multivariate
polynomial equations.

Part 2 introduces Grobner bases, and for codes satisfying ttme ^ tscH, we 
describe the FG-algorithm. We show why it is less complex than the algorithm of Chen et al.
[1], and less complex than the XP algorithm [9].

We prove that the number of needed attempts of the whole algorithm is 
reduced to 1. 

In some cases, the FG-algorithm runs with polynomials with one indeterminate.
Then, dealing with Gröbner bases is equivalent to dealing simply with GCDs.
In this case, the maximal number of needed polynomials during Buchberger's
algorithm is given. Xin prediction is suggested to decrease the complexity.



1 Basics 

Throughout this paper $\deg a(X)$ means “degree of $a(X)$”, and $W_H(a(X))$ is for
the Hamming weight of $a(X)$.

Lemma 1. 1) The set $I$ of all $a(X)$ in $A_r$ such that $T(a(X)) = 0$ on locations
$i_1, i_2, \ldots, i_s$ is an ideal of $A_r$ generated by the minimum polynomial (over
$GF(q^r)$) of $\{\beta^{i_1}, \ldots, \beta^{i_s}\}$ ($I$ depends on the set of locations).

2) If the set {ii, ...is} is some union of cyclotomic classes in 2Z ){(]" — 1) for 
V dividing r, then I contains an ideal of Ay . 

3) The set of all polynomials $a(X)$ in $A_r$ with determined values for $T(a(X))$
in locations $i_1, i_2, \ldots, i_s$ is a coset of $I$.

Proof. Points 1) and 3) are direct.

2) The minimum polynomial of $\{\beta^{i_1}, \ldots, \beta^{i_s}\}$ has its coefficients from $GF(q^v)$.

□

For every pair of polynomials $V(X) = V_0 + V_1 X + \ldots + V_{n-1} X^{n-1}$ and
$U(X) = U_0 + U_1 X + \ldots + U_{n-1} X^{n-1}$ we define their (ordinary) scalar product
by $\langle V(X), U(X) \rangle = \sum_{i=0,\ldots,n-1} V_i U_i$.

We say that $U(X)$ is $V(X)$-orthogonal iff $\langle V(X), X^i U(X) \rangle = 0$ for $i =
0, 1, \ldots, n - 1$. We also say that the polynomial $U(X)$ is $(V_j, t)$-orthogonal if
$\langle V(X), X^\lambda U(X) \rangle = 0$ for $\lambda = j, j + 1, \ldots, j + t - 1$.

Lemma 2. Given some $s(X) = \sum_{j=1}^{w} s_{i_j} X^{i_j}$ in $A_r$ we call locator
of $s(X)$ the monic polynomial $L_s(X)$ whose roots are $\beta^{i_1}, \ldots, \beta^{i_w}$. Set $S(X) =
T(s(X))$.

a) A polynomial $L(X)$ is $S(X)$-orthogonal iff $L(\beta^{i_j}) = 0$, $j = 1, \ldots, w$.

b) The set of the $L(X)$'s which are $S(X)$-orthogonal is the ideal $(L_s(X))$.

Proof. a) and b) are direct.

□

Example 1. $n = 7$, $q = 8$, $1 + \alpha + \alpha^3 = 0$.

a) $s(X) = 1 + X^3$, $S(X) = \alpha X + \alpha^2 X^2 + \alpha^6 X^3 + \alpha^4 X^4 + \alpha^3 X^5 + \alpha^5 X^6$.

The locator of $s(X)$ is $L_s(X) = \alpha^3 + \alpha X + X^2$. $L_s(X)$ is $S(X)$-orthogonal.

For example $\langle (0, \alpha, \alpha^2, \alpha^6, \alpha^4, \alpha^3, \alpha^5), (1, 0, 0, 0, 0, \alpha^3, \alpha) \rangle = 0$ holds.

Also, $\alpha^3 + \alpha X + \alpha X^2 + \alpha X^3 + X^4$ is $S(X)$-orthogonal.

b) $s(X) = 1 + X + X^3 + X^4$, $S(X) = \alpha^4 X + \alpha X^2 + X^3 + \alpha^2 X^4 + X^5 + X^6$.
$L(X) = 1 + \alpha^4 X + X^2$ is $(S_1, 2)$-orthogonal, but not $(S_1, 3)$-orthogonal.
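
The computations of Example 1 can be replayed with a few lines of code. The sketch below is an illustration only: it assumes $GF(8)$ is built from $1 + \alpha + \alpha^3 = 0$ with elements stored as 3-bit integers, $\beta = \alpha$, and the transform $S_i = s(\beta^i)$. The helper names (`mul`, `evaluate`, `dft`, `shift`, `scalar`, `is_orthogonal`) are hypothetical.

```python
N = 7
EXP = [1, 2, 4, 3, 6, 7, 5]              # EXP[i] = alpha^i as a 3-bit integer
LOG = {v: i for i, v in enumerate(EXP)}


def mul(a, b):
    """Product in GF(8); addition in GF(8) is the XOR of the integers."""
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % N]


def evaluate(coeffs, x):
    """Horner evaluation of a polynomial given by coefficients from degree 0 up."""
    acc = 0
    for c in reversed(coeffs):
        acc = mul(acc, x) ^ c
    return acc


def dft(s):
    """S_i = s(beta^i) with beta = alpha."""
    return [evaluate(s, EXP[i]) for i in range(N)]


def shift(u, k):
    """Coefficients of X^k * U(X) modulo X^n - 1."""
    out = [0] * N
    for i, c in enumerate(u):
        out[(i + k) % N] = c
    return out


def scalar(v, u):
    acc = 0
    for a, b in zip(v, u):
        acc ^= mul(a, b)
    return acc


def is_orthogonal(S, U, j, t):
    """(S_j, t)-orthogonality: <S, X^lam U> = 0 for lam = j, ..., j + t - 1."""
    return all(scalar(S, shift(U, lam)) == 0 for lam in range(j, j + t))


S_a = dft([1, 0, 0, 1, 0, 0, 0])          # s(X) = 1 + X^3
L_a = [EXP[3], EXP[1], 1, 0, 0, 0, 0]     # alpha^3 + alpha X + X^2
S_b = dft([1, 1, 0, 1, 1, 0, 0])          # s(X) = 1 + X + X^3 + X^4
L_b = [1, EXP[4], 1, 0, 0, 0, 0]          # 1 + alpha^4 X + X^2
```

Full $S(X)$-orthogonality corresponds to `is_orthogonal(S, U, 0, N)`, since the shifts are taken modulo $X^n - 1$.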

In the following, we first suppose that all coefficients of $S(X)$ are known, and
then only some of them are known.

1.1 All Coefficients of $S(X)$ Are Known: Linear Algebra

We have a little refinement of Lemma 2.

Proposition 1. Let some $S(X)$ be given, as well as some integer $j \leq n -
1$. Suppose there exists some polynomial $R_0(X)$ with degree $t$, which is $(S_j, t)$-orthogonal.

a) Suppose that some $R_1(X)$, with degree at most $t$, is $(S_j, t)$-orthogonal.
Then the GCD $(R_0(X), R_1(X))$ $(= R_M(X))$ is $(S_j, 2t - \deg R_M(X))$-orthogonal.

b) The set $E$ of all polynomials which have degree at most $t$ and which are
$(S_j, t)$-orthogonal is the set of all multiples of some polynomial $R(X)$. Each
polynomial $B(X)$ which is $(S_j, t)$-orthogonal then is $(S_j, 2t - \deg B(X))$-orthogonal.

Proof. a) The GCD $R_M(X)$ can be obtained from successive Euclidean divisions:
$R_0(X) = Q_1(X) R_1(X) + R_2(X)$, $R_1(X) = Q_2(X) R_2(X) + R_3(X)$, ...,
$R_{M-1}(X) = Q_M(X) R_M(X)$.

By hypothesis $R_1(X)$ is $(S_j, \deg R_0(X))$-orthogonal.

Moreover, from $R_0(X) = Q_1(X) R_1(X) + R_2(X)$, and $\deg R_0(X) \leq t$, it is
clear that $Q_1(X) R_1(X)$ is $(S_j, t - \deg R_0(X) + \deg R_1(X))$-orthogonal. So, $R_2(X)$
is $(S_j, \deg R_1(X))$-orthogonal.

Now, suppose that $R_{i-1}(X)$ is $(S_j, \deg R_{i-2}(X))$-orthogonal, as well as $R_i(X)$
is $(S_j, \deg R_{i-1}(X))$-orthogonal. It is a direct induction to prove that $R_{i+1}(X)$
is $(S_j, \deg R_i(X))$-orthogonal. We conclude that $R_M(X)$ is $(S_j, \deg R_{M-1}(X))$-orthogonal.
Whatever $\deg R_M(X)$ is, $R_M(X)$ is $(S_j, \deg R_{M-1}(X) - \deg R_M(X))$-orthogonal.

As $R_M(X)$ divides $R_{M-1}(X)$ which is $(S_j, \deg R_{M-2}(X))$-orthogonal, it follows
that the polynomial $R_M(X)$ is $(S_j, \deg R_{M-2}(X) - \deg R_M(X))$-orthogonal.
Iterating, $R_M(X)$ is $(S_j, \deg R_0(X) - \deg R_M(X))$-orthogonal, and from the hypothesis
on $R_0(X)$, $R_M(X)$ is $(S_j, 2t - \deg R_M(X))$-orthogonal.

b) As $R_M(X)$ divides $R_i(X)$ for $i = 0, \ldots, M - 1$, then $R_i(X)$ is $(S_j, 2t -
\deg R_i(X))$-orthogonal.

Moreover, from the property $GCD(a_1, a_2, a_3) = GCD(GCD(a_1, a_2), a_3)$ the
statement comes from point a). □

Observe that $R_M(X)$ is not necessarily the locator of $s(X)$, as shown in the
following example.

Example 2. Consider the field $GF(16)$, with $\alpha$ a primitive element such that $1 +
\alpha + \alpha^4 = 0$. In $A_4$, consider $S(X)$ whose coefficients are:

1 , 0 , a, o? ^ Of®, a®, a®, 0 , a®, 1 , a®, 1 , a. 

Set t = 4, j = 1, Rq{X) = 1+X+a^X'^+X^+X'^ and Ri{X) = a+a^X+X^. 
$R_M(X)$ is equal to $1 + \alpha X + X^2$. As it is irreducible over $GF(16)$ it cannot be
$L_s(X)$.



Corollary 1. With the same notations as in proposition 1:

a) If $R_0(X)$ is $(S_j, t + 1)$-orthogonal, then $R_M(X)$ is $(S_j, 2t + 1 - \deg R_M(X))$-orthogonal.
As a consequence, if $R_0(X)$ is $S(X)$-orthogonal then $R_M(X)$ is
$S(X)$-orthogonal.

b) If $R_0(X)$ is the locator of $s(X)$ then $R_M(X) = R_0(X)$. In other words,
there does not exist any other polynomial which is $(S_j, t)$-orthogonal, up to scalar
multiplication.

c) If $R_0(X)$ is not the locator of $s(X)$ then $L_s(X)$ has degree $t$,
and is $S(X)$-orthogonal, and has its $t - \deg L_s(X)$ first coefficients which are 0.

d) Set

$$M_j = \begin{pmatrix} S_j & S_{j+1} & \cdots & S_{j+\lambda} \\ S_{j+1} & S_{j+2} & \cdots & S_{j+\lambda+1} \\ \vdots & & & \vdots \\ S_{j+\mu-\lambda-1} & \cdots & & S_{j+\mu-1} \end{pmatrix},$$

with integers $j \geq 0$, $\mu - 1 - \lambda \geq 0$, $\mu \leq 2\lambda$. Set $W_H(s(X)) = u$. If $u \geq \mu - \lambda$ then $M_j$ has rank equal to
$\mu - \lambda$.


Proof. a) Direct, because $R_M(X)$ divides $R_0(X)$.

b) From proposition 1, $R_M(X)$ divides the locator. From a) of lemma 2 the
degree of the locator is minimal.

c) Direct. This remark will be used for decoding with the FG-algorithm (see
proposition 4).

/ Sjj-\-u+i \ 



d) By point b) the matrix 



has rank u because 



V ■■■ Sjj.\-u+i J 

Ls{X) is {Sj+x- „+i, «.)-orthogonal, and by hypothesis u> p — X. 



□ 



1.2 Some Coefficients of $S(X)$ Are Known: Polynomial Equations

Now suppose that $S(X)$ has its known coefficients in locations $i_1, i_2, \ldots, i_s$.

We show how to obtain all possible explicit polynomials $S(X)$.

Choose some integer $t$ strictly less than $n$, and consider a polynomial $L(X) =
Y_0 + Y_1 X + Y_2 X^2 + \ldots + Y_{t-1} X^{t-1} + X^t$ whose coefficients $Y_0, Y_1, \ldots, Y_{t-1}$ are
indeterminates.

For every integer $u$ within $\{0, \ldots, n - 1\}$ we define a scalar product by
$P_u = \langle (S_u, S_{u+1}, \ldots, S_{u+t-1}), (Y_0, Y_1, \ldots, Y_{t-1}) \rangle$.

$P_u$ is a polynomial expression (over the field $GF(q^r)$) which contains the indeterminates
$Y_0, \ldots, Y_{t-1}$, and possibly some others, say $S_{j_1}, \ldots, S_{j_v}$ for some
$v$. It means that in the range $\{u, u + 1, \ldots, u + t - 1\}$ the unknown coefficients
of $S(X)$ are $S_{j_1}, \ldots, S_{j_v}$. Note that $v$ may depend on $u$ and $t$.

Choose some value $u_0$ for $u$. Set $SE(t, u_0)$ for a set of equations, empty at
the beginning, into which we will add equations, using the following algorithm:

For $j = 0, \ldots, n - 1$ do

    If $S_{u_0+t+j}$ is known

    then add $P_{u_0+j} + S_{u_0+t+j} = 0$ to $SE(t, u_0)$.
    else substitute $-P_{u_0+j}$ for $S_{u_0+t+j}$.

end For

($u_0 + k$ is performed modulo $n$, for every $k$).

Now, to $SE(t, u_0)$ add the polynomial equations:

$Y_i^{q^r} - Y_i = 0$ $(i = 0, \ldots, t - 1)$, and $S_{j_m}^{q^r} - S_{j_m} = 0$ $(m = 1, \ldots, w)$.

Every solution of this system is a $(w + t)$-tuple $(s_{j_1}, \ldots, s_{j_w}, y_0, \ldots, y_{t-1})$,
whose components are from $GF(q^r)$. Each one gives rise to an explicit polynomial
$L(X) = y_0 + y_1 X + y_2 X^2 + \ldots + y_{t-1} X^{t-1} + X^t$ and to an explicit polynomial $S(X)$
such that, by construction, $L(X)$ is $S(X)$-orthogonal. The number of solutions may
be equal to 0.

Example 3. Consider a cyclic code of length 7, over GE{8), with roots a®. 

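Suppose $s(X) = \alpha + \alpha^2 X^3$, with $S(X)$ known at locations 1, 3, 6: $S_1 = \alpha^6$, $S_3 =
\alpha^2$, $S_6 = \alpha^5$. Choose $u_0 = 1$, $t = 2$. We have $L(X) = Y_0 + Y_1 X + X^2$. $SE(2, 1)$
contains six equations:

$\alpha^6 Y_0 + S_2 Y_1 + \alpha^2 = 0$,
$\alpha^5 + S_2 Y_0^2 + S_2 Y_0 Y_1^2 + \alpha^2 Y_1^3 = 0$,
$\alpha^6 + \alpha^5 Y_0 + \alpha^5 Y_1^2 + \alpha^2 Y_0^2 Y_1 + S_2 Y_0^2 Y_1^2 + \alpha^2 Y_0 Y_1^3 = 0$,
$Y_0^8 - Y_0 = 0$, $Y_1^8 - Y_1 = 0$, $S_2^8 - S_2 = 0$.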

Proposition 2. a) The number of solutions of the system $SE(t, u_0)$ is the number
of distinct polynomials $L(X)$.

b) Suppose we obtain some explicit polynomial $S(X)$. The number of polynomials
$L(X)$ we obtain which are $S(X)$-orthogonal is $q^{rc}$, with $c = t - W_H(s(X))$.

Proof. a) Two distinct polynomials $L(X)$ correspond to two distinct solutions
of the system, by construction.

b) A polynomial $L(X)$ is a multiple of the locator $L_s(X)$ of $s(X)$ (see point
b) of lemma 2). The number of monic multiples of $L_s(X)$ with degree $t$ (it is the
degree of each $L(X)$) is $q^{rc}$, with $c = t - \deg L_s(X) = t - W_H(s(X))$. □

2 Application to Codes 

In the theory of Gröbner bases, a system of polynomial equations (such as $SE(t, u_0)$)
is associated with a basis of some ideal $J$. Buchberger's algorithm constructs
a sequence of bases (that corresponds to constructing a sequence of equivalent
systems of polynomial equations). At the end of the process one has a basis of
$J$, called a Reduced Gröbner Basis (RGB). When there are only finitely many
solutions, that RGB is a system of polynomial equations with one equation having
only one indeterminate. A zero of $J$ is simply a solution of each system.

RGB are used in Chen et al. [1], in order to correct errors up to the true
capability $t_{true}$.

The authors use a code over GF{q), with length n, and known syndromes 
Sj^, . . . , Sj^ . The error polynomial is represented as e(X) = X)i=i « ■ They 
find the RGB of the ideal Jm given below. (Y^~^ — 1,...,!^“^ — 1, Z” — 
1, . . . , z;; - 1, TiZf + . . . + YiZi- + . . . + - Sjj 

Each zero of $J_m$ is a $(2m)$-tuple which gives the values of the $Y_i$'s and $Z_i$'s.
From these values one can recover $e(X)$ if $W_H(e(X)) = m$.

As WH{e{X)) is not known, Chen et al. have to run their algorithm from 
TTi = + 1 to TO = ttrue (in the worst case). The maximal number of attempts 

is then time 

As we already said, the concrete complexity in implementing Gröbner bases
for decoding mainly comes from the number of indeterminates, the maximal
number of polynomials appearing in the successive bases during Buchberger's
algorithm, and the number of attempts.

In the following we also propose to use Gröbner bases, but with lower
complexity.

Let $C$ be a code over $GF(q)$, with length $n$ relatively prime to $q$.

When binary, the considered codes are from [4] (these codes satisfy dscH ^ 

dtrue)' 

Set $GF(q^r)$ for the smallest extension field of $GF(q)$ containing $\beta$, a primitive
$n$-th root of unity, and $\alpha$ a primitive element.

Suppose that $c(X)$ is the emitted codeword. During transmission, errors occur,
and the receiver receives $c(X) + e(X)$ $(= R(X))$, where $e(X)$ is the error
polynomial. As usual, we set $S_i$ for the syndrome $R(\beta^i)$.

In the polynomial $T(e(X)) = S_0 + S_1 X + \cdots + S_{n-1} X^{n-1}$ $(= S(X))$ the
coefficient $S_i$ is known as soon as $\beta^i$ is a root of the code. Suppose these locations
are $i_1, i_2, \ldots, i_s$ and suppose also that the syndromes $S_{u_0}, S_{u_0+1}, \ldots, S_{u_0+d_{BCH}-2}$
are known, for some $u_0$.

Set L{X) = Y 0 + Y 1 X + .. . + + , where To, ■ ■ • 

are indeterminates. 

Example 4- (continuation of example 3). The RGB we obtain is : {a -I- a'^Yo + 
o^Yq , o'* -I- aYo + a^S' 2 , a® -I- a^Yg + aYi}. 

The zeroes {Yq =, Yi =, S 2 =) are {(a, 1, a®), (a®, a, 0)}. 

The zero (a®, a, 0) gives rise to Ls{X) = + aX + A^. 

We now determine the number of needed indeterminates for L{X). 

Proposition 3. Suppose $t_{true} + 1 \le d_{BCH} - 1$.

a) The number of needed indeterminates is $2t_{true} - d_{BCH} + 1$.

b) If $d_{BCH}$ is odd, the number of needed indeterminates is $2(t_{true} - t_{BCH})$.

c) If $d_{BCH}$ is even, the number of needed indeterminates is $2(t_{true} - t_{BCH}) - 1$.

Proof. From the hypothesis $t_{true} + 1 \le d_{BCH} - 1$, only the $Y_i$'s are needed
indeterminates. In other words, in the range $u_0, u_0 + 1, \ldots, u_0 + t_{true}$ all syndromes
are known.

a) In point d) of Corollary 1, set $\lambda = t_{true}$ and $\mu = d_{BCH} - 1$. Then, if
$w_H(e(X)) > d_{BCH} - t_{true} - 1$, the rank of $M_j$ is $d_{BCH} - t_{true} - 1$. We can
eliminate $d_{BCH} - t_{true} - 1$ indeterminates, which leaves
$t_{true} - (d_{BCH} - t_{true} - 1) = 2t_{true} - d_{BCH} + 1$ of them.

b) If $d_{BCH}$ is odd ($d_{BCH} = 2t_{BCH} + 1$) we can eliminate $2t_{BCH} - t_{true}$ indeterminates, and the
number of remaining ones is $2(t_{true} - t_{BCH})$.

c) If $d_{BCH}$ is even ($d_{BCH} = 2t_{BCH} + 2$) we can eliminate $2t_{BCH} + 1 - t_{true}$ indeterminates, and
the number of remaining ones is $2(t_{true} - t_{BCH}) - 1$. □

Proposition 3 means that we will run Buchberger's algorithm with only
$2(t_{true} - t_{BCH})$ or $2(t_{true} - t_{BCH}) - 1$ indeterminates.

We note that Xin prediction [8] is less expensive than Gaussian elimination
for obtaining the needed indeterminates.
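The counts in Proposition 3 are simple arithmetic, and the sketch below evaluates them for a few codes quoted later in the paper. It assumes that the capabilities are $t = \lfloor (d-1)/2 \rfloor$ for both the true and the BCH distance; the function names are illustrative and not taken from the paper.

```python
def capability(d):
    """Error-correcting capability floor((d - 1) / 2) for a distance d."""
    return (d - 1) // 2


def fg_indeterminates(t, d_bch):
    """Count from Proposition 3 a): 2t - d_BCH + 1 (assumes t + 1 <= d_BCH - 1)."""
    return 2 * t - d_bch + 1


def fg_indeterminates_by_parity(t, d_bch):
    """Same count written as in points b) and c) of Proposition 3."""
    t_bch = capability(d_bch)
    if d_bch % 2 == 1:
        return 2 * (t - t_bch)
    return 2 * (t - t_bch) - 1


# (d_true, d_BCH) for the codes (21,7,8,5), (35,16,7,6) and (55,25,11,7)
for d_true, d_bch in [(8, 5), (7, 6), (11, 7)]:
    t_true = capability(d_true)
    print(d_true, d_bch, fg_indeterminates(t_true, d_bch))
```

Both forms agree whenever the hypothesis of the proposition holds, which is a quick way to check the parity cases b) and c) against a).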

Now we consider binary codes, from [4]. One problem arises: when using the
FG-algorithm, even if $e(X)$ has binary coefficients, it is possible to obtain
a zero which gives rise to some $s(X)$ with non-binary coefficients. In order to
avoid this situation we have the following lemma.

Lemma 3. Suppose we add to $SE(t, u_0)$ the equations $P_{2k} - P_k^2 = 0$ for every $k$
such that $P_k$ is already calculated. Then the zeroes we get from the RGB give rise
to polynomials $s(X)$ which have binary coefficients.

Proof. This is because every explicit polynomial $S(X)$ must satisfy $S_{2k} = S_k^2$. □
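The relation $S_{2k} = S_k^2$ used in the proof can be checked numerically. The following is a minimal sketch over $GF(16)$, assuming the primitive polynomial $x^4 + x + 1$ and length $n = 15$; the helper names are hypothetical. It also shows that the relation fails for a non-binary error value, which is why the extra equations filter out unwanted zeroes.

```python
N = 15          # code length, assumed for this illustration
EXP = [0] * N   # EXP[i] = alpha^i as a 4-bit integer
LOG = [0] * 16
x = 1
for i in range(N):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x10:        # reduce modulo x^4 + x + 1
        x ^= 0x13


def gf_mul(a, b):
    """Multiply two elements of GF(16)."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % N]


def syndrome(error, k):
    """S_k = e(alpha^k) for an error given as {position: value}."""
    s = 0
    for pos, value in error.items():
        s ^= gf_mul(value, EXP[(k * pos) % N])
    return s


def satisfies_conjugacy(error):
    """True if S_{2k} = S_k^2 for every k modulo N."""
    return all(
        syndrome(error, (2 * k) % N) == gf_mul(syndrome(error, k), syndrome(error, k))
        for k in range(N)
    )


binary_error = {0: 1, 1: 1, 3: 1}   # e(X) = 1 + X + X^3
nonbinary_error = {0: 2}            # e(X) = alpha
print(satisfies_conjugacy(binary_error), satisfies_conjugacy(nonbinary_error))
```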

Now we prove that only one attempt of the complete algorithm is necessary
to find $L_e(X)$.

Proposition 4. a) If the error is uncorrectable then the RGB reduces to a
constant.

b) Suppose the error is correctable, with Hamming weight $t_{true}$. Then each
polynomial in the RGB has only one indeterminate, and has degree one.

c) If the error polynomial has Hamming weight $t_{true} - \lambda$, then add the
coefficients $L_0, \ldots, L_{\lambda-1}$ to the RGB. The new RGB we obtain contains only
polynomials with only one indeterminate, and of degree 1.

Proof. a) Suppose the ideal has a zero. Then there exist a polynomial
$L(X)$ and a polynomial $S(X)$ such that $L(X)$ is $S(X)$-orthogonal. From b) of
Proposition 1, the degree of $L(X)$ is greater than the Hamming weight of the error
polynomial. From Lemma 3, the (binary) error would be correctable, contrary
to the hypothesis.

b) As the error is the only (binary) polynomial of weight at most $t_{true}$ in
its own coset, there exists only one zero. Moreover, the ideal must contain one
polynomial of degree 1 for each indeterminate.

c) Among the polynomials $L(X)$ we obtain, there is the polynomial $X^c L_e(X)$
with $c = t - w_H(e(X))$ (see point c) of Corollary 1). □

Let us give the FG-algorithm:

1) Reduce the number of indeterminates (see Proposition 3). Get $L(X)$, with
some indeterminates.

2) Construct the initial basis (see Lemma 3).

3) Run Buchberger's algorithm, and get the RGB.

- If the RGB reduces to a constant, then the error is uncorrectable.

- If the RGB contains only polynomials of degree 1 in one variable
each, then correct.

- In other cases, add to the RGB some coefficients of $L(X)$, get a new
RGB, and correct.
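The three-way decision at the end of step 3 can be sketched as a small classifier. This is only an illustration of the branching, not of Buchberger's algorithm itself: a polynomial is assumed to be stored as a dictionary mapping exponent tuples to nonzero coefficients, and the coefficient values used below are placeholders rather than actual field elements.

```python
def is_constant(poly):
    """True if every monomial of the polynomial has total degree 0."""
    return all(sum(mono) == 0 for mono in poly)


def variables(poly):
    """Indices of the indeterminates that actually occur in the polynomial."""
    return {i for mono in poly for i, e in enumerate(mono) if e > 0}


def degree(poly):
    """Total degree of the polynomial."""
    return max(sum(mono) for mono in poly)


def classify_rgb(basis):
    """Map a reduced basis to one of the three outcomes of step 3."""
    if any(is_constant(p) for p in basis):
        return "uncorrectable"
    if all(len(variables(p)) == 1 and degree(p) == 1 for p in basis):
        return "correct"
    return "add coefficients of L(X)"


# Two indeterminates (Y0, Y1); monomials are exponent tuples (e0, e1).
print(classify_rgb([{(0, 0): 1}]))
print(classify_rgb([{(0, 0): 5, (1, 0): 3}, {(0, 0): 2, (0, 1): 7}]))
print(classify_rgb([{(0, 0): 1, (1, 0): 1, (0, 1): 1}]))
```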



When $t_{true} + 1 \le d_{BCH} - 1$, the FG-algorithm compares favorably with the method of Chen et al.,
which needs $t_{true}$ indeterminates.

If $t_{true} + 1 > d_{BCH} - 1$ then the FG-algorithm cannot run (see Step 1).
Fortunately, this case only holds for the following 8 codes, with parameters
(length, dimension, $d_{true}$, $d_{BCH}$), among the 147 given in [4]:

(33,13,10,5), (43,15,13,7), (47,24,11,5), (47,23,12,6), (51,25,10,5),
(51,17,12,6), (55,30,10,5), (57,21,14,6).

In these cases, the method of Chen et al. uses fewer indeterminates than the
FG-algorithm.
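As a quick consistency check, the applicability condition can be evaluated on the eight listed parameter sets. The sketch assumes $t_{true} = \lfloor (d_{true} - 1)/2 \rfloor$; the function name is illustrative.

```python
EXCEPTIONAL_CODES = [
    (33, 13, 10, 5), (43, 15, 13, 7), (47, 24, 11, 5), (47, 23, 12, 6),
    (51, 25, 10, 5), (51, 17, 12, 6), (55, 30, 10, 5), (57, 21, 14, 6),
]


def fg_can_run(d_true, d_bch):
    """Hypothesis of Proposition 3: t_true + 1 <= d_BCH - 1."""
    t_true = (d_true - 1) // 2
    return t_true + 1 <= d_bch - 1


for n, k, d_true, d_bch in EXCEPTIONAL_CODES:
    print((n, k, d_true, d_bch), fg_can_run(d_true, d_bch))

# The (21,7,8,5) code of Examples 5 and 6 satisfies the hypothesis.
print(fg_can_run(8, 5))
```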

Example 5. Consider the code (21,7,8,5). Let $\beta$ be an element of order 21 in
$GF(64)$. The cyclotomic classes containing the roots of the code are the
respective classes of $\beta, \beta^3, \beta^7, \beta^9$. One chooses $u_0 = 1$.
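The parameters of this code can be cross-checked from its root set. The sketch below assumes the roots are the 2-cyclotomic classes of $\beta, \beta^3, \beta^7, \beta^9$ modulo 21 (the reading of the garbled exponents that reproduces the stated dimension 7 and $d_{BCH} = 5$), and it only looks for runs of consecutive exponents with step 1.

```python
def cyclotomic_class(s, n, q=2):
    """Exponents s, s*q, s*q^2, ... modulo n."""
    cls = []
    x = s % n
    while x not in cls:
        cls.append(x)
        x = (x * q) % n
    return sorted(cls)


def bch_bound(roots, n):
    """Longest cyclic run of consecutive root exponents, plus one."""
    best = 0
    for start in roots:
        run = 0
        while run < n and (start + run) % n in roots:
            run += 1
        best = max(best, run)
    return best + 1


n = 21
roots = set()
for s in (1, 3, 7, 9):
    roots.update(cyclotomic_class(s, n))

print(sorted(roots))
print("dimension:", n - len(roots))
print("BCH bound:", bch_bound(roots, n))
```

The run 1, 2, 3, 4 starts at exponent 1, which is consistent with the choice $u_0 = 1$.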

Suppose e{X) = 1 -I- X -I- X"^. We need 2 indeterminates. 

After step 1 of FG-algorithm, one has L{X) = a® -|- -|- YiX + 

Y2X^ + X^. 

The RGB we obtain is {a'^^+a^^Yo, a®-|-a^®Fi}. The unique zero is (a®^, a^^). 
The polynomial L{X) we have is then L{X) = E + a^"^X + 

j|.g j.QQ|;g are which correspond to /3°,/3,/?^. 

Example 6. Consider the same code as in example 5, namely the (21,7,8,5) 
code. It is a code with ttrue = 3. The polynomial L{X) has degree 3. Suppose 
the error polynomial is 1 -I- X'^. 

First we obtain L{X) = (a® -|- a^Yo + a^^Yi) + YqX + YiX'^ + X^. 

The RBG is -I- + a'^^Fi}. It is not reduced to a constant, so 

the error is correctable. It is not a pair of polynomials with degree 1 with one 
variable. It means that the degree of the locator is less than 3. We add a® -I- 
a^Yo + a^^Fi to the RBG, and we run buchberger’s algorithm. We then get 
-I- a®^Fi,o;^® -|- a®Fo}. The unic zero is (q;^^,q;^). The explicit L{X) is 
L{X) = A(a^^ -I- a"^X + A^), whose roots are l,a^^. As /3 = a® it means that 
erroneous locations are 1,4, which is correct. 

With the FG-algorithm it is sometimes possible to correct up to $t_{true} + 1$ errors, as
is shown now.

Proposition 5. Suppose the code has an even minimum distance $2t_{true} + 2$, and
the number of its minimum weight codewords is $N_{min}$. If we choose $t = t_{true} + 1$
and if we use the previous algorithm, then we can correct some errors with
Hamming weight $t_{true} + 1$. In case of hard decoding, the probability to not correct
IS at most ^ 

Proof. If $e(X)$ is the only element in its own coset with Hamming weight at most
$t_{true} + 1$, then we can correct. If not, it means that $e(X)$ is a part of some
codeword of minimum weight $2t_{true} + 2$. □

Example 7. Consider the binary (15,6,6,6) code, with l,a,a^ ,a^ ,a^ as roots, 
and 1 + a + alpha‘s = 0. Suppose e{X) = 1 + X + X^. We have L{X) = 
+ a'^Yo) + YoX + a^X^ + X^. 

The RGB we obtain is a® + a^X, and the unic zero is This zero gives 
rise to Le{X) = + a^'^X + + X^ . The error can be corrected. 

Sometimes every word of Hamming weight $t_{true} + 1$ is a part of a minimum
weight $2(t_{true} + 1)$ codeword (as for even Hamming codes). Then hard decoding
cannot run. Observe that if hard decoding cannot run, then soft decoding may
correct most errors with Hamming weight $t = t_{true} + 1$.

A Table for FG-Algorithm up to 4 Indeterminates

Even using a powerful DSP such as the Texas Instruments TMS320C62 along with
64 MB of SDRAM, it looks difficult to implement the FG-algorithm
with more than 4 indeterminates.

The following table shows what could be realistic for implementing the
FG-algorithm. It also compares the number of needed indeterminates for the
FG-algorithm (column 1) and for the algorithm of Chen et al. (column 2).



Table 1. Comparison of number of needed indeterminates.

FG   Chen            remarks                                      example        correction
1    $t_{BCH}+1$     $t_{true} = t_{BCH} > 1$, both even          (31,20,6,6)    some ($t_{BCH}+1$)
1    $t_{BCH}+1$     $t_{true} = t_{BCH}+1$, $d_{BCH}$ even       (35,16,7,6)    $t_{BCH}+1$
2    $t_{BCH}+1$     $t_{true} = t_{BCH}+1$, $d_{BCH}$ odd        (21,7,8,5)     $t_{BCH}+1$
3    $t_{BCH}+2$     $t_{true} = t_{BCH}+1$, both even            (51,26,8,6)    some ($t_{BCH}+2$)
4    $t_{BCH}+2$     $t_{true} = t_{BCH}+2$, $d_{true}$ odd       (55,25,11,7)   $t_{BCH}+2$

Conclusion 

The proposed FG-algorithm, which uses the Discrete Fourier Transform as well as
Gröbner bases, is mainly dedicated to decoding cyclic codes up to $t_{true}$ for codes
having $t_{BCH} < t_{true}$ and $t_{true} - t_{BCH} \le 2$.
In the binary case the FG-algorithm needs $2(t_{true} - t_{BCH})$ (resp. $2(t_{true} - t_{BCH}) - 1$)
indeterminates when $d_{BCH}$ is odd (resp. even). This is less than the algorithm of Chen et al.,
which needs $t_{true}$ indeterminates, and sometimes less than the XP algorithm,
which needs $2(t_{true} - t_{BCH})$ indeterminates whatever $d_{BCH}$ is.

For non-binary codes the number of needed indeterminates for the FG-algorithm
remains the same as for binary codes, whereas for Chen et al. it doubles.

The FG-algorithm is used for each received word. One referee considers that the
only realistic way to use Gröbner bases is to have some preprocessing step for the
code itself. We agree with him, except when one uses 1, 2 or 3 indeterminates.
On the other hand, the FG-algorithm reduces the number of needed indeterminates.
That looks interesting whatever decoding algorithm is used with Gröbner
bases. Another practical advantage of the FG-algorithm is that it needs only one
attempt for decoding one word.

We are now working on how to implement some particular codes from [4]
with the FG-algorithm.
Abstract. The paper studies self-orthogonal codes over $GF(3)$. The
state complexities of such codes of lengths $\le 20$ with efficient coordinate
ordering are found.



1 Introduction 

A good deal of effort has been devoted to finding the minimum “trellis complex- 
ity” of linear codes. It is well known that the minimum trellis size depends on 
the order of the code coordinates [6]. 

We consider in this paper only linear codes over GF(3), whose generator ma- 
trices have no zero columns. An [n,k,d] code C has length n, dimension k and 
minimum Hamming distance d. 

A self-orthogonal code is maximal if it is not contained in a larger self- 
orthogonal code of the same length [2] . 

The support of a vector $a = (a_{(1)}, a_{(2)}, \ldots, a_{(n)})$ in $GF(3)^n$ is defined by $\chi(a) =
\{j \mid a_{(j)} \ne 0\}$. The minimum support weight, $d_r$, of a code $C$ is the size of the
smallest support of any $r$-dimensional subcode of $C$. In particular $d_1 = d$.

The weight hierarchy of $C$ is the set $\{d_1, d_2, \ldots, d_k\}$.

The concepts of chain condition and two-way chain condition were introduced
by Wei and Yang [13] and Forney [5] respectively.

Definition 1. An $[n,k]$ code $C$ satisfies the chain condition if it is equivalent to
a code $C'$ such that there exists a chain of subcodes of $C'$, $D_1 \subset D_2 \subset \cdots \subset D_k =
C'$, where, for $1 \le r \le k$, we have $\dim(D_r) = r$, and $\chi(D_r) = \{1, 2, \ldots, d_r\}$.

Theorem 1. (Duality) [13] Let $C$ be an $[n,k]$ code and $C^\perp$ be its dual code.
Then $\{d_r(C) : 1 \le r \le k\} = \{1, 2, \ldots, n\} - \{n + 1 - d_r(C^\perp) : 1 \le r \le n - k\}$.
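Theorem 1 turns the weight hierarchy of one code into that of its dual by a set complement. The sketch below applies it to two hierarchies quoted in Section 2, namely {3,4} for the self-dual [4,2,3] code and {6,8,9} for the [9,3,6] code; the function name is illustrative.

```python
def dual_weight_hierarchy(hierarchy, n):
    """Weight hierarchy of the dual code obtained from the duality theorem."""
    excluded = {n + 1 - d for d in hierarchy}
    return sorted(set(range(1, n + 1)) - excluded)


# The [4,2,3] code is self-dual, so its hierarchy is mapped to itself.
print(dual_weight_hierarchy([3, 4], 4))

# Dual of the [9,3,6] code: a [9,6] code with six generalized weights.
print(dual_weight_hierarchy([6, 8, 9], 9))
```

Applying the function twice returns the original hierarchy, since the relation in the theorem is symmetric in the code and its dual.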

Definition 2. An $[n,k]$ code $C$ satisfies the two-way chain condition (TCC) if
it is equivalent to a code $C'$ with the following property: there exist two chains
of subcodes of $C'$, the left chain $D_1^L \subset D_2^L \subset \cdots \subset D_k^L = C'$, and the right
chain $D_1^R \subset D_2^R \subset \cdots \subset D_k^R = C'$, where, for $1 \le r \le k$, we have $\dim(D_r^L) =
\dim(D_r^R) = r$, $\chi(D_r^L) = \{1, 2, \ldots, d_r\}$, and $\chi(D_r^R) = \{n - d_r + 1, n - d_r + 2, \ldots, n\}$.



Marc Fossorier et al. (Eds.): AAECC-13, LNCS 1719, pp. 454-461, 1999.
© Springer-Verlag Berlin Heidelberg 1999
Lemma 1. [3] If $C = C_1 \oplus C_1 \oplus \cdots \oplus C_1$, where $C_1$ satisfies the two-way chain
condition, then $C$ satisfies the two-way chain condition.

A code has an efficient coordinate ordering (eco) if and only if it satisfies the
two-way chain condition [5]. If $C$ satisfies the TCC, then $s_i(C) = k - k_i - k_{n-i}$,
where $k_i$ is the maximum dimension of any subcode of $C$ whose effective length
(support size) is $i$, $1 \le i \le n$ [5]. The maximum component of $s(C)$ is denoted
by

$s_{max} = \max_i s_i(C)$.

The minimum $s_{max}(C)$ over all coordinate orderings is called the state
complexity $s(C)$ of $C$ (see [12] for a nice survey).
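For a fixed coordinate ordering the state profile can be computed by brute force from the standard past/future description, $s_i = k - \dim(\text{past}_i) - \dim(\text{future}_i)$, where the past (future) subcode consists of the codewords supported on the first $i$ (last $n - i$) positions. The sketch below does this for the [4,2,3] ternary code with generator rows 1110 and 0121 from Section 2; it is an illustration for one ordering only and does not minimise over orderings.

```python
from itertools import product


def codewords(gen, q=3):
    """All GF(q)-linear combinations of the generator rows (q prime)."""
    n = len(gen[0])
    return {
        tuple(sum(c * row[j] for c, row in zip(coeffs, gen)) % q for j in range(n))
        for coeffs in product(range(q), repeat=len(gen))
    }


def dimension(count, q=3):
    """Dimension of a linear code with the given number of codewords."""
    dim = 0
    while count > 1:
        count //= q
        dim += 1
    return dim


def state_profile(gen, q=3):
    """State dimensions s_0, ..., s_n for the given coordinate ordering."""
    words = codewords(gen, q)
    n = len(gen[0])
    k = dimension(len(words), q)
    profile = []
    for i in range(n + 1):
        past = sum(1 for w in words if not any(w[i:]))
        future = sum(1 for w in words if not any(w[:i]))
        profile.append(k - dimension(past, q) - dimension(future, q))
    return profile


E4 = [(1, 1, 1, 0), (0, 1, 2, 1)]
profile = state_profile(E4)
print(profile, max(profile))
```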

Lemma 2. [7] A coordinate ordering is efficient for a linear code $C$ if and only
if it is efficient for its dual code $C^\perp$.

We shall use the notation of [2], [9], [10] whenever we refer to these papers. The
direct sum $C_1 \oplus C_1$ will be denoted by $2C_1$, etc.

The outline of this paper is as follows. Section 2 presents a necessary and
sufficient condition for some particular codes to have an eco. All codes of length
$\le 12$ with eco are found.

Section 3 considers coordinate orderings of all self-dual codes of length 16
with weights divisible by 3.

All self-dual codes of length 20 with eco are described in Section 4. A
necessary graphical condition for a code to admit an eco is provided.



2 Maximal Self-Orthogonal Codes of Length $\le 12$

The maximal self-orthogonal codes of length $\le 12$ have been completely classified
in [9]. In [2], [9], and [10] codes are efficiently described as being made up of
various components held together by glue vectors. The following component
codes shall be used in this paper.

1) $e_3$ is the [3,1,3] code with generator matrix (111) and the coordinate
ordering is obviously efficient.

2) $e_4$ is the [4,2,3] code with generator matrix

1110
0121 .

Its weight hierarchy is {3,4} and its coordinate ordering is efficient.

3) $g_8$ is an [8,2,6] code and may be formed by doubling each codeword of $e_4$.
Its weight hierarchy is {6,8} and its coordinate ordering is efficient.

4) $g_9$ is the [9,3,6] code with generator matrix

111111000
111000111
120120120 .

Its weight hierarchy is {6,8,9} and its coordinate ordering is efficient.

5) The "free" code $f_n$ consists of the zero vector of length $n$.
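The weight hierarchies quoted for these small components can be confirmed by exhaustive search. The following sketch enumerates all $r$-dimensional subcodes through bases of nonzero codewords and records the smallest support; it assumes that the generator rows are linearly independent, and it is only practical for very small codes.

```python
from itertools import combinations, product


def span(vectors, q=3):
    """All GF(q)-linear combinations of the given vectors (q prime)."""
    n = len(vectors[0])
    return {
        tuple(sum(c * v[j] for c, v in zip(coeffs, vectors)) % q for j in range(n))
        for coeffs in product(range(q), repeat=len(vectors))
    }


def weight_hierarchy(gen, q=3):
    """Minimum support weights d_1, ..., d_k by brute force."""
    n, k = len(gen[0]), len(gen)
    nonzero = [w for w in span(gen, q) if any(w)]
    hierarchy = []
    for r in range(1, k + 1):
        best = n
        for basis in combinations(nonzero, r):
            if len(span(basis, q)) != q ** r:
                continue  # the chosen codewords are linearly dependent
            # The support of a subcode is the union of the supports of a basis.
            support = sum(1 for j in range(n) if any(v[j] for v in basis))
            best = min(best, support)
        hierarchy.append(best)
    return hierarchy


E3 = [(1, 1, 1)]
E4 = [(1, 1, 1, 0), (0, 1, 2, 1)]
G9 = [
    (1, 1, 1, 1, 1, 1, 0, 0, 0),
    (1, 1, 1, 0, 0, 0, 1, 1, 1),
    (1, 2, 0, 1, 2, 0, 1, 2, 0),
]
for name, gen in (("e3", E3), ("e4", E4), ("g9", G9)):
    print(name, weight_hierarchy(gen))
```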

A code $C$ is reversible if $(c_{(0)}, c_{(1)}, \ldots, c_{(n-1)}) \in C$ implies $(c_{(n-1)}, \ldots, c_{(1)}, c_{(0)}) \in C$ [8].



Lemma 3. [4] If the rows of a generator matrix of a reversible code generate 
the subcodes in the left chain from Definition 2, then the code satisfies the TCC. 



Proposition 1. Let $C = a e_3 \oplus b e_4$ where $a$, $b$ are nonnegative integers. $C$ has eco
iff $a + b > 0$ and $ab = 0$.

Proof. If $a + b > 0$ and $ab = 0$ then $C$ has eco by Lemma 1.

Let $C$ have eco and $C = a e_3 \oplus b e_4$. If $ab \ne 0$, then the weight hierarchy of $C$ will
be

$d_{2p+1} = 3 + 4p$, $0 \le p \le b - 1$

$d_{2p} = d_{2p-1} + 1$, $1 \le p \le b$

$d_{2b+l} = d_{2b} + 3l$, $1 \le l \le a$.

There is only one subspace of dimension $2b$ with weight hierarchy $\{d_1, d_2, \ldots, d_{2b}\}$,
generated by the last $2b$ rows of $C$. In order to obtain an eco for $C$ we would need
two such subspaces with supports in the first and last $2b$ positions, respectively.

□

A code C is indecomposable if it cannot be written as a direct sum of other 
codes. 



Corollary 1. All indecomposable self-orthogonal codes generated by vectors of
weight 3 have eco.

Proof. According to Theorem 11 [9] the only indecomposable self-orthogonal
codes generated by vectors of weight 3 are $e_3$ and $e_4$. Then the proof follows by
Proposition 1. □

Corollary 2. All self-dual codes generated by vectors of weight 3 have eco.

Proof. According to Corollary 12 [9] the only self-dual codes generated by vectors
of weight 3 are $e_4 \oplus e_4 \oplus \cdots \oplus e_4$. Then the proof follows by Proposition 1. □
Lemma 4. There are exactly 3 decomposable maximal self-orthogonal codes of
length $\le 12$ that do not have eco and exactly 3 decomposable maximal
self-orthogonal codes of length $\le 12$ that have eco.

Proof. The first three codes are $e_3 \oplus e_4$, $2e_3 \oplus e_4$ and $e_3 \oplus 2e_4$. They do not have
an eco by Proposition 1.

The other three are $2e_3$, $2e_4$ and $3e_4$. Their coordinate ordering is efficient by
Lemma 1. □

A hexad of a self-orthogonal code $C$ of minimum weight 6 is the binary vector
of length $n$ which is the support of a pair of codewords $c, c' \in C$ of weight 6 [10].



Theorem 2. There are exactly six inequivalent codes that do not have eco and
exactly nine inequivalent codes that have eco among maximal self-orthogonal
codes of length $\le 12$.

Proof. The three decomposable maximal self-orthogonal codes of length $\le 12$
which do not have eco have been discussed in Lemma 4. The other three which
do not have eco are the indecomposable codes given below.

i) The [10,4,3] code has generator matrix

1110000000
0001110000
1200001111
0001201212 .

The first two elements of its weight hierarchy are 3 and 6. If the code had eco
then it would contain two codewords $c_1, c_2$ of weight 6 with supports in the first
and last six positions, respectively, and two codewords of weight 3 with supports
in the first and last three positions.

Consider any two vectors of weight 6 which intersect in exactly two positions
$\{c_{(i)}, c_{(j)}\} = A$.

It turns out by inspection that vectors of weight 3 cover at least one of these
two positions for each couple in $A$. Therefore, the [10,4,3] code does not have
eco.

ii) The [11,5,3] code has generator matrix

11100000000
00011100000
00000011100
12012000012
21000012012 .

If the code had eco then it would contain two codewords $c_1, c_2$ of weight 6
with supports in the first and last six positions, respectively, and $wt(c_1 + c_2)$
would be 10 or 11.
The weight distribution of $C$ [9] shows that all weights in $C$ are divisible by
3. Therefore, $C$ does not have eco.
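The divisibility claim for the [11,5,3] code can be checked directly by listing its 243 codewords. The sketch below uses the five generator rows as printed above and is only a brute-force illustration, not the argument of [9].

```python
from collections import Counter
from itertools import product

ROWS = [
    "11100000000",
    "00011100000",
    "00000011100",
    "12012000012",
    "21000012012",
]
GEN = [[int(ch) for ch in row] for row in ROWS]


def weight_distribution(gen, q=3):
    """Number of codewords of each Hamming weight, with duplicates removed."""
    n = len(gen[0])
    words = {
        tuple(sum(c * row[j] for c, row in zip(coeffs, gen)) % q for j in range(n))
        for coeffs in product(range(q), repeat=len(gen))
    }
    return Counter(sum(1 for x in w if x) for w in words)


dist = weight_distribution(GEN)
print(sorted(dist.items()))
print(all(weight % 3 == 0 for weight in dist))
```

Since 10 and 11 are not multiples of 3, no pair of weight-6 codewords can overlap in exactly one position, which is the obstruction used in the proof.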

iii) The next code has parameters [11,5,6] and is the set of vectors in the ternary
[12,6,6] Golay code $g_{12}$ which are zero in one fixed position. We need two hexads
which meet in one position only. By [10] two distinct hexads meet in 0, 2, 3 or 4
coordinates.

This concludes the proof for codes not having an eco.

Codes of length 3 and 4, i.e. $e_3$ and $e_4$, are shown to have eco at the beginning
of this section. The other three decomposable codes which have eco are
discussed in Lemma 4.

A direct check shows that the coordinate orderings of the $3e_3(9)$ and $4e_3(12)$
codes are efficient.

An eco for $g_{12}$ may be seen from the following generator matrix

111111000000
001221110000
010122101000
021012100100
010110200110
000000211111 .

The last code $g_{10}$ has parameters [10,4,6] and is the set of vectors in $g_{12}$
which are zero on two fixed coordinates. Its weight hierarchy is
{6,8,9,10}, found by applying Theorem 1. Finally, an eco for $g_{10}$ may be
seen from the following generator matrix

1002001212
0101020222
0011021011
0000111221 .

The right chain can be seen immediately. The left chain of subcodes $\{D_r^L\}$,
$1 \le r \le 4$, is generated by the first $r$ vectors from $\{c_1 + c_2 + c_3 + c_4, c_1 + c_4, c_1 +
c_3, c_4\}$ where the rows are denoted by $c_i$, $1 \le i \le 4$. □



3 Type III Codes of Length 16 



A theorem of Gleason and Pierce (see [11]) implies that a self-dual code over
$GF(q)$ can only have all weights divisible by some integer $t > 1$ in five cases:

I : $q = 2$, $t = 2$,
II : $q = 2$, $t = 4$,
III : $q = 3$, $t = 3$,
IV : $q = 4$, $t = 2$,
V : $q$ arbitrary, $t = 2$, weight enumerator $(x^2 + (q - 1)y^2)^{n/2}$.
In the literature they are known as codes of types I-V [2].

In [2] and [10] a code is regarded as consisting of a number of components
which are held together by glue. The glue vectors are $a = 120$, $\bar{a} = 210$, $x$, $y$
and $t_0$ [10]. Vectors $x$ and $y$ are chosen so that $11x \in g_{12}$ and $12y \in g_{12}$.
Vectors $t_0, t_1, \ldots, t_{12}$ are the cyclic shifts of the vector $t_0 = (1101000001000)$ and
they generate a [13,7] code [8].

Theorem 3. There are two codes that have eco and five codes that do not have
eco among type III codes of length 16.

Proof. All type III codes of length 16 are enumerated in [2]. The $4e_4$ code has
eco by Lemma 1. The unique [16,8,6] code has generator matrix $[I \mid H_8]$, where
$H_8$ is the Hadamard matrix

11111111
12221211
12212112
12121122
11211222
12112221
11122212
11222121 .

If the columns of $[I \mid H_8]$ are numbered from 1 to 16 then the following column
permutation 1,2,10,11,12,14,6,8,13,15,16,9,7,3,4,5 shows that the code has eco.
From this column permutation the left chain of subcodes $\{D_r^L\}$, $1 \le r \le 8$, is
generated by the first $r$ vectors from $\{c_1 + 2c_2, c_6 + 2c_8, c_1 + 2c_8, c_6, c_7, c_3, c_4, c_5\}$
and the right chain of subcodes is generated by the vectors $\{c_3 + c_4 + c_5 + c_7, c_4 +
c_5, c_6 + c_8, c_5 + c_6, c_6 + c_7, c_3, c_2, c_1\}$.
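The two chains can be checked mechanically. The sketch below assumes the reading of the generating vectors given by $\{c_1 + 2c_2, c_6 + 2c_8, c_1 + 2c_8, c_6, c_7, c_3, c_4, c_5\}$ for the left chain and $\{c_3 + c_4 + c_5 + c_7, c_4 + c_5, c_6 + c_8, c_5 + c_6, c_6 + c_7, c_3, c_2, c_1\}$ for the right chain (the subscripts are partly illegible in the source), and verifies that the accumulated supports are exactly the leading, respectively trailing, columns of the stated permutation.

```python
H8 = ["11111111", "12221211", "12212112", "12121122",
      "11211222", "12112221", "11122212", "11222121"]
PERM = [1, 2, 10, 11, 12, 14, 6, 8, 13, 15, 16, 9, 7, 3, 4, 5]


def row(i):
    """Row c_i (1-indexed) of the generator matrix [I | H8]."""
    identity = [1 if j == i else 0 for j in range(1, 9)]
    return identity + [int(ch) for ch in H8[i - 1]]


def combine(*terms):
    """Ternary combination sum(coeff * c_i) for terms given as (coeff, i)."""
    vec = [0] * 16
    for coeff, i in terms:
        vec = [(a + coeff * b) % 3 for a, b in zip(vec, row(i))]
    return vec


def chain_sizes(vectors, from_left):
    """Support sizes of the nested subcodes; raises if a support is misplaced."""
    covered, sizes = set(), []
    for v in vectors:
        covered |= {j + 1 for j, x in enumerate(v) if x}
        m = len(covered)
        block = set(PERM[:m]) if from_left else set(PERM[-m:])
        if covered != block:
            raise ValueError("support is not a block of the permuted columns")
        sizes.append(m)
    return sizes


LEFT = [combine((1, 1), (2, 2)), combine((1, 6), (2, 8)), combine((1, 1), (2, 8)),
        row(6), row(7), row(3), row(4), row(5)]
RIGHT = [combine((1, 3), (1, 4), (1, 5), (1, 7)), combine((1, 4), (1, 5)),
         combine((1, 6), (1, 8)), combine((1, 5), (1, 6)), combine((1, 6), (1, 7)),
         row(3), row(2), row(1)]

print(chain_sizes(LEFT, from_left=True))
print(chain_sizes(RIGHT, from_left=False))
```

Both chains produce the same sequence of support sizes, as the two-way chain condition requires.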

The $4e_3 \oplus e_4$ code with glue words $aaa00$, $0\bar{a}aa0$ does not have eco. Looking
at the weight distribution of the code [2] we notice that all codewords of weight
3 are generated by the first 6 rows of the code. The first 2 elements of its weight
hierarchy are 3 and 4. Applying Proposition 1 we complete the proof.

The $g_{12} \oplus e_4$ code does not have eco. Note that all codewords of weight 3 are
generated by the last 2 rows of its generator matrix. The proof is similar to the
one in the previous case.

The $4e_3 \oplus f_4$ code with glue $(a000)$, $(2111)$ does not have eco because there
are no 2 different subspaces of dimension 4 with support {3, 6, 9, 12}.

The $2e_3 \oplus g_{10}$ code with glue words $a0x$, $0ay$ does not have eco since there
is only one subspace of dimension 2 and support {3,6}.

The last code is $e_3 \oplus p_{13}$ with glue $at_0$. It does not have eco because it
contains exactly 2 vectors of weight 3 and one of them is obtained from the
other by multiplication by 2. □
Corollary 3. The state complexities of the two previous codes that have eco are
$s(4e_4) = 2$ and $s([16, 8, 6]) = 5$.

4 Self-Dual Codes of Length 20

Lemma 5. Among the seven decomposable self-dual codes of length 20 there is
one with eco and six without.

Proof. All decomposable self-dual codes of length 20 are enumerated in [10]. The
coordinate ordering of $5e_4$ is efficient by Lemma 1.

The $4e_3 \oplus 2e_4$ code with glue words $aaa000$, $0a\bar{a}a00$ and the $g_{12} \oplus 2e_4$ code do
not have eco because they do not contain 2 different subspaces of dimension 4 and
weight hierarchy {3, 4, 7, 11}.

The last 4 codes do not have eco since they do not contain 2 different subspaces of
dimension 2 and weight hierarchy {3,4}. □

Corollary 4. The state complexity of the previous eco code is $s(5e_4) = 2$.

Let $B_i = \{b_1, b_2, \ldots, b_{d_2}\}$ denote the support of a two-dimensional subspace
of $C$ containing at least one codeword of weight $d_1$.

Note that there might exist no such $B_i$; such codes are studied in [1] under the
name of antichain codes.

We define an undirected graph $\Gamma$ with vertex set $\{B_i\}$, two nodes $B_i$ and $B_j$
being joined by an edge if and only if $B_i \cap B_j \ne \emptyset$.

Proposition 2. Let $C$ be of length $n \ge 2d_2$. If the associated graph $\Gamma$ is
complete then $C$ does not have eco.

Proof. In order to find an eco for $C$ we should find two subspaces of dimension two
with disjoint supports and weight hierarchies $\{d_1, d_2\}$. Since $\Gamma$ is complete such
an ordering does not exist. □

Example 1. Consider all hexads in the code $10f_2$ (code 19 in Table III [10]).
Any two hexads with support 8 form a set $B_i$. There are exactly five different
sets $B_i$ and any two of these sets meet in exactly two positions. Therefore, the
associated graph is complete, regular and its valency is 4.

By Proposition 2 we conclude that the code $10f_2$ does not have eco.
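The combinatorics of Example 1 can be reproduced on a toy model: five sets of size 8 inside 20 positions, any two meeting in exactly two positions. The labelling below is hypothetical (each unordered pair of sets is simply assigned two private positions) and is not the actual coordinate numbering of the code in [10]; it only illustrates the completeness test of Proposition 2.

```python
from itertools import combinations


def build_supports(num_sets=5, shared=2):
    """Hypothetical supports: every pair of sets shares `shared` positions."""
    supports = {i: set() for i in range(num_sets)}
    position = 1
    for i, j in combinations(range(num_sets), 2):
        for _ in range(shared):
            supports[i].add(position)
            supports[j].add(position)
            position += 1
    return supports


def adjacency(supports):
    """Edges of the graph: two supports are joined iff they intersect."""
    return {
        i: {j for j in supports if j != i and supports[i] & supports[j]}
        for i in supports
    }


def is_complete(supports):
    graph = adjacency(supports)
    return all(len(neighbours) == len(supports) - 1 for neighbours in graph.values())


supports = build_supports()
print({i: sorted(s) for i, s in supports.items()})
print(is_complete(supports))
```

With a complete graph no two supports are disjoint, so the two disjoint subspaces needed at the two ends of an efficient ordering cannot exist.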

Remark 1. $10f_2$ is the only self-dual code of length 20 and minimum distance 6
that does not have eco.

Let $C = [g(k, d) + 2l, k, d]$, $l \le 2$, be a self-dual code which satisfies the chain
condition. W.l.o.g. we may assume that its coordinates are arranged in such a
way that the rows of its generator matrix generate the left chain of subcodes
from Definition 2.
Let $C$ be a self-dual code of length $n_a + n_b$. Partition its generator matrix as
follows

A 0
0 B
D E,

where subcodes $A$ and $D$ have even length $n_a$. Let the dimension of $D$ be $k_d$
and the last $k_d$ rows be the glue words for $A$ and $B$.

Corollary 5. The state complexities of the previous eco codes are:

$s(6e_3 \oplus f_2) = s(4e_3 \oplus g_8) = 3$,

$s(5e_3 \oplus f_5) = 4$,

$s(4f_4 \oplus 2f_2) = s(5f_4) = s(2g_9 \oplus 2f_2) = s(2g_{10}) = 5$,

$s(4f_5) = 6$.



References 

1. G. Cohen, S. Encheva and G. Zémor: Antichain codes. Designs, Codes and
Cryptography (to appear)

2. J. H. Conway, V. Pless and N. J. A. Sloane: Self-Dual Codes over GF(3) and GF(4)
of Length not Exceeding 16. IEEE Trans. Inform. Theory 25 (1979) 312-316

3. S. Encheva: On binary linear codes which satisfy the two-way chain condition.
IEEE Trans. Inform. Theory 42 (1996) 1038-1047

4. S. Encheva: On repeated-root cyclic codes and the two-way chain condition.
Springer-Verlag LNCS 1255 (1997)

5. G. D. Forney, Jr.: Density/Length Profiles and Trellis Complexity of Lattices. IEEE
Trans. Inform. Theory 40 (1994) 1753-1772

6. P. A. Franaszek: Sequence-state coding for digital transmission. Bell Syst. Tech. J.
47 (1968) 143-157

7. T. Kasami, T. Takata, T. Fujiwara and S. Lin: On the optimum bit orders with
respect to the state complexity of trellis diagrams for binary linear codes. IEEE
Trans. Inform. Theory 39 (1993) 242-245

8. F. J. MacWilliams and N. J. A. Sloane: The Theory of Error-Correcting Codes.
North-Holland, Amsterdam (1977)

9. C. L. Mallows, V. Pless and N. J. A. Sloane: Self-Dual Codes over GF(3). SIAM
J. Appl. Math. 31 (1976) 649-666

10. V. Pless, N. J. A. Sloane and H. N. Ward: Ternary Codes of Minimum Weight 6
and the Classification of the Self-Dual Codes of Length 20. IEEE Trans. Inform.
Theory 26 (1980) 305-316

11. N. J. A. Sloane: Self-dual codes and lattices. In Relations between Combinatorics
and other Parts of Mathematics. Amer. Math. Soc. Proc. Sympos. Pure Math. vol.
XXXIV (1979) 273-308

12. A. Vardy: Trellis Structure of Codes. In Handbook of Coding Theory II. Elsevier
(1998) 1989-2117

13. V. K. Wei and K. Yang: On the generalized Hamming weights of product codes.
IEEE Trans. Inform. Theory 39 (1993) 1709-1713




Binary Optimal Linear Rate 1/2 Codes 



Koichi Betsumiya (1), T. Aaron Gulliver (2), and Masaaki Harada (3)

(1) Nagoya University, Graduate School of Mathematics
Nagoya 464-8602, Japan
koichi@math.nagoya-u.ac.jp

(2) University of Victoria, Dept. of Electrical & Computer Eng.
P.O. Box 3055, STN CSC, Victoria, B.C., Canada V8W 3P6
agullive@ece.uvic.ca

(3) Yamagata University, Department of Mathematical Sciences
Yamagata 990-8560, Japan
mharada@sci.kj.yamagata-u.ac.jp



Abstract. In this paper, we classify the optimal linear [n,n/2] codes of 
length up to 12. We show that there is a unique optimal odd formally 
self-dual [20, 10, 6] code up to equivalence. We also show that at least one 
optimal linear [n, n/2] code is self-dual or formally self-dual for lengths 
up to 48 (except 38 and 40). 



1 Introduction 

A binary linear $[n, k]$ code $C$ is a $k$-dimensional vector subspace of $F_2^n$, where $F_2$
is the finite field of two elements. The rate of a linear $[n, k]$ code $C$ is defined as
$k/n$. The elements of $C$ are called codewords. The weight $wt(x)$ of a codeword $x$
is the number of non-zero coordinates. The minimum weight of $C$ is the smallest
weight among all non-zero codewords of $C$. An $[n, k, d]$ code is an $[n, k]$ code with
minimum weight $d$. Two codes $C$ and $C'$ are equivalent if one can be obtained
from the other by permuting the coordinates. The automorphism group of $C$
is the set of permutations of the coordinates which preserve $C$. The weight
enumerator of $C$ is $W_C(y) = \sum_{i=0}^{n} A_i y^i$, where $A_i$ is the number of codewords
of weight $i$ in $C$.

The dual code $C^\perp$ of $C$ is defined as $C^\perp = \{x \in F_2^n \mid x \cdot y = 0 \text{ for all } y \in C\}$
where $x \cdot y$ denotes the standard inner product of $x$ and $y$. A code $C$ is self-dual
if $C = C^\perp$. A code $C$ is formally self-dual if $C$ and $C^\perp$ have identical weight
enumerators. Self-dual codes are by definition also formally self-dual. A formally
self-dual code is called even if the weights of all codewords are even, otherwise
the code is odd.

A linear $[n, k]$ code $C$ is optimal if $C$ has the highest minimum weight among
all linear $[n, k]$ codes (see [4] for current bounds on the highest minimum weight).
It is well known that there is a unique optimal linear code up to equivalence for
parameters [8,4,4] and [24,12,8], namely the extended Hamming code and the
extended Golay code. Recently it has been shown that there is a unique optimal
rate 1/2 linear $[n, n/2]$ code for $n$ = 16, 18, 22 and 28, up to equivalence [3,10,12].
A natural question that now arises is how many inequivalent optimal linear
rate 1/2 codes there are for other small lengths. In this paper, we present the
classification of optimal linear $[n, n/2]$ codes of length up to 12. We show that at
least one optimal linear $[n, n/2]$ code is self-dual or formally self-dual for lengths
up to 48 (except 38 and 40, for which the highest minimum weight has not yet
been determined). We also show that there is a unique optimal odd formally
self-dual [20,10,6] code up to equivalence. Throughout this paper, we denote
the highest minimum weight among all linear $[n, n/2]$ codes by $d_{max}(n)$ (see
Table 3 for the actual values of $d_{max}(n)$).

2 Classification of Length up to 12

In this section, we classify the optimal linear $[n, n/2, d_{max}(n)]$ codes up to length
12. Let $N_S(n)$, $N_E(n)$ and $N_O(n)$ denote the numbers of inequivalent optimal
linear $[n, n/2, d_{max}(n)]$ codes which are self-dual, even formally self-dual and
odd formally self-dual, respectively, and let $N_T(n)$ denote the total number of
inequivalent optimal linear $[n, n/2, d_{max}(n)]$ codes.

2.1 Lengths 2, 4, 6 and 8

It is trivial that there is a unique linear code for parameters [2,1,2] and [6,3,3],
and there are exactly three inequivalent [4,2,2] codes, namely the unique
self-dual code, the unique even formally self-dual code and the unique odd formally
self-dual code. The extended Hamming code is the unique [8,4,4] code and the
code is self-dual. Hence $N_T(2) = 1$, $N_T(4) = 3$, $N_T(6) = 1$ and $N_T(8) = 1$.

2.2 Length 10 

First note that $d_{max}(10) = 4$. The classifications of even and odd formally
self-dual [10,5,4] codes are given in [11] and [3], respectively. Thus $N_S(10) = 0$,
$N_E(10) = 1$ and $N_O(10) = 1$.

We describe how the optimal linear [10,5,4] codes $C$ were classified. Every
[10,5] code is equivalent to a code with generator matrix of the form $(I, A)$
where $A$ is a 5 x 5 (1,0)-matrix. Thus we only need to consider the set of
5 x 5 (1,0)-matrices $A$, rather than the set of generator matrices. The set of
matrices $A$ was constructed, row by row, using a back-tracking algorithm under
the condition that the first row is (00111), since the minimum weight of $C$ is 4.
Permuting the rows of $A$ gives rise to different generator matrices which generate
equivalent codes. We consider only those matrices $A$ which are smallest among
all matrices obtained from $A$ by permuting its rows, where the ordering involved
is lexicographical on the binary integers corresponding to the rows of the matrix.
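The row-by-row construction can be sketched as follows. This is an illustrative back-tracking search under stated assumptions, not the authors' program: rows of $A$ are stored as 5-bit integers, the first row is fixed to 00111, rows are kept in strictly increasing order (the normal form under row permutations), and a partial matrix is kept only if every combination of its rows already has weight at least 4. No equivalence testing under column permutations is attempted.

```python
from itertools import combinations


def keeps_min_weight(rows, new_row, d):
    """Check all codewords of [I | A] that involve the newly added row."""
    for r in range(len(rows) + 1):
        for subset in combinations(rows, r):
            a_part = new_row
            for other in subset:
                a_part ^= other
            # weight = identity part (r + 1 ones) + weight of the A part
            if (r + 1) + bin(a_part).count("1") < d:
                return False
    return True


def search(k=5, first_row=0b00111, d=4):
    """All row-sorted k x k matrices A such that [I | A] has minimum weight >= d."""
    found = []

    def extend(rows):
        if len(rows) == k:
            found.append(tuple(rows))
            return
        for candidate in range(rows[-1] + 1, 1 << k):
            if keeps_min_weight(rows, candidate, d):
                extend(rows + [candidate])

    if keeps_min_weight([], first_row, d):
        extend([first_row])
    return found


matrices = search()
print(len(matrices), "candidate matrices A before equivalence testing")
```

Fixing the first row to 00111 loses nothing in this normal form, because every row must have weight at least 3 and 00111 is the smallest such 5-bit integer.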
Our computer search shows that an optimal linear [10,5,4] code which is not
formally self-dual is equivalent to one of the two codes $C_{10,1}$ and $C_{10,2}$, with
generator matrices $G_{10,1}$ and $G_{10,2}$ where

$G_{10,1}$ =
10000 00111
01000 01011
00100 01101
00010 01110
00001 10011

and $G_{10,2}$ =
10000 00111
01000 01011
00100 01101
00010 10011
00001 10101

respectively. Some equivalences were verified by Magma. The weight enumerators
of $C_{10,1}$ and $C_{10,2}$ are

$1 + 18y^4 + 8y^6 + 5y^8$ and $1 + 16y^4 + 12y^6 + 3y^8$,

respectively.
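The two weight enumerators, and the fact that neither code is formally self-dual, can be confirmed by exhaustive enumeration. The sketch below assumes the generator matrices are $(I, A)$ with the rows of $A$ equal to 00111, 01011, 01101, 01110, 10011 for the first code and 00111, 01011, 01101, 10011, 10101 for the second; function names are illustrative.

```python
from collections import Counter
from itertools import product

G10_1 = ["1000000111", "0100001011", "0010001101", "0001001110", "0000110011"]
G10_2 = ["1000000111", "0100001011", "0010001101", "0001010011", "0000110101"]


def to_rows(lines):
    return [[int(ch) for ch in line] for line in lines]


def weight_enumerator(gen):
    """Dictionary {weight: number of codewords} of the binary code."""
    n = len(gen[0])
    counts = Counter()
    for msg in product((0, 1), repeat=len(gen)):
        word = [sum(m * row[j] for m, row in zip(msg, gen)) % 2 for j in range(n)]
        counts[sum(word)] += 1
    return dict(sorted(counts.items()))


def dual_weight_enumerator(gen):
    """Weight enumerator of the dual code, by testing all 2^n vectors."""
    n = len(gen[0])
    counts = Counter()
    for x in product((0, 1), repeat=n):
        if all(sum(a * b for a, b in zip(x, row)) % 2 == 0 for row in gen):
            counts[sum(x)] += 1
    return dict(sorted(counts.items()))


for name, lines in (("C10,1", G10_1), ("C10,2", G10_2)):
    gen = to_rows(lines)
    primal = weight_enumerator(gen)
    dual = dual_weight_enumerator(gen)
    print(name, primal, "formally self-dual:", primal == dual)
```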

Hence we have the following: 

Proposition 1 There are exactly four inequivalent optimal linear [10, 5, 4] codes. 

2.3 Length 12 

There is a unique self-dual [12,6,4] code, up to equivalence. The classifications
of even and odd formally self-dual [12,6,4] codes are given in [1] and [3],
respectively. Thus $N_S(12) = 1$, $N_E(12) = 2$ and $N_O(12) = 5$.

Similar to length 10, we found all distinct 6 x 6 (1,0)-matrices $A$ such that
the matrices $(I, A)$ generate optimal linear [12,6,4] codes. The set of
matrices $A$ was constructed, row by row, using a back-tracking algorithm under
the condition that the first row is (000111) and considering a lexicographical
ordering of the rows. The codes are divided into the following 35 distinct weight
enumerators:

1Ti2,i = 1 -k 6/ -k 24/ -k 16/ -k 9/ -k 8/ 

Wi 2,2 = l + 8y^ + 20y^ + 14y® -k 8y^ + -k 4y® -k 2y^° 

Wi2,3 = 1 -k 9/ -k 18/ -k 13/ -k 12/ -k 6/ -k 2/ -k 3y^° 

Wi2,4 = 1 + 10/ -k 15/ -k 16/ -k 11/ -k 5/ -k 5/ -k 

1Ti2,5 = 1 + 10/ -k 16/ -k 12/ -k 16/ -k 5/ -k 4y^° 

Wi2,6 = 1 + 10/ -k 16/ -k 16/ -k 8/ -k 5/ -k 8/ 

1 Ti 2,7 = 1 + 10/ -k 18/ -k 10/ -k 12/ -k 9/ -k 2/ -k 2yi° 

1 Ti 2,8 = 1 + 10/ + 20/ + 8/ + 8/ + 13/ + 4/ 

1Ti 2,9 = 1 -k 11/ -k 14/ -k 15/ -k 12/ -k 4/ -k 6/ -k y^° 

Wi2,io = 1 -k 11/ -k 16/ -k 8/ -k 16/ -k 11/ 

IT12.11 = 1 + 12/ + 13/ + 12/ + 15/ + 7/ + 3/ + 

IT12.12 = 1 + 12/ + 14/ + 12/ + 12/ + 7/ + 6/ 
VL12.13 = 1 + 12/ + 16t/^ + 6/ + 16/ + 11/ + 

M4i2.i4 = 1 + 12/ + 20/ + 6/ + 8/ + 11/ + 4/ + 2j/i° 

M4i2,i5 = 1 + 13/ + 12/ + 11/ + 16/ + 6/ + 4/ + 

Il"i2.i6 = 1 + 14/ + 36/ + 9y® + 4yio 

14^12,17 = 1 + 14/ + 12/ + 8/ + 16/ + 9/ + 4/ 

14^12.18 = 1 + 15/ + 32/ + 15/ + 

44^12.19 = 1 + 15/ + 33/ + 12/ + 3j/i° 

44^12.20 = 1 + 15/ + 10/ + 15/ + 12/ + 10/ + j/i° 

44^12.21 = 1 + 15/ + 16/ + 16/ + 15/ + 

4Ri 2,22 = 1 + 16/ + 30/ + 15/ + 2yi° 

44^12,23 = 1 + 16/ + 9/ + 12/ + 15/ + 3/ + 7/ + 

44^12,24 = 1 + 16/ + 10/ + 12y® + 12j/^ + 3y® + lO/ 

44^12,25 = 1 + 17/ + 27/ + 18/ + 

44^12,26 = 1 + 18/ + 24/ + 21/ 

44^12,27 = 1 + 18/ + 28/ + 13/ + 4j/io 

44^12,28 = 1 + 18/ + 8/ + 8/ + 16/ + 5/ + 8/ 

44^12,29 = 1 + 19/ + 24/ + 19/ + 

44^12.30 = 1 + 19/ + 25/ + 16y® + 3j/i° 

44^12.31 = 1 + 20/ + 22/ + 19y® + 2y^° 

44^12.32 = 1 + 22/ + 20/ + 17y® + 4j/i° 

44^12.33 = 1 + 23/ + 16/ + 23y® + y^^ 

44^12.34 = 1 + 25/ + 27/ + lOy® + 

44^12,35 = 1 + 26/ + 24/ + 13/. 

We verified that all [12, 6, 4] codes are equivalent for each weight enumerator
except $W_{12,12}$, $W_{12,18}$, $W_{12,22}$, $W_{12,26}$, $W_{12,27}$ and $W_{12,31}$. For these six weight
enumerators, the numbers of inequivalent codes are 2, 3, 3, 2, 2 and 2,
respectively. Some equivalences were verified by Magma. Note that the codes with
$W_{12,i}$ ($i = 1, 2, \ldots, 5$) are odd formally self-dual and the codes with $W_{12,18}$ are
even formally self-dual codes including the unique self-dual code. We consider
only the remaining codes, and complete the classification by listing generator
matrices $( I , G_{12,i} )$ for the inequivalent codes where the weight enumerator of
$G_{12,i}$ is $W_{12,i}$. To save space, we list the matrix $G_{12,i}$ using the form $g_1, g_2, \ldots, g_6$
where $g_j$ is the $j$-th row of $G_{12,i}$.

$G_{12,6}$ = 000111, 001011, 010011, 011101, 100101, 111011
$G_{12,7}$ = 000111, 001011, 010011, 011101, 100101, 111001
$G_{12,8}$ = 000111, 001011, 001101, 011110, 101110, 110001
$G_{12,9}$ = 000111, 001011, 010101, 011001, 100110, 111010
$G_{12,10}$ = 000111, 001011, 010011, 100101, 111001, 111100
$G_{12,11}$ = 000111, 001011, 010011, 100101, 101001, 111100
$G_{12,12-1}$ = 000111, 001011, 001101, 010011, 101110, 110001
$G_{12,12-2}$ = 000111, 001011, 010011, 011101, 100101, 101001
$G_{12,13}$ = 000111, 001011, 010011, 011101, 100011, 101101
$G_{12,14}$ = 000111, 001011, 010011, 011101, 011110, 100101
$G_{12,15}$ = 000111, 001011, 001101, 010011, 100101, 111001
$G_{12,16}$ = 000111, 001011, 010011, 100101, 111000, 111110
$G_{12,17}$ = 000111, 001011, 001101, 010011, 100011, 110101
$G_{12,19}$ = 000111, 001011, 010101, 011001, 100110, 111000
$G_{12,20}$ = 000111, 001011, 010101, 011001, 011111, 101101
$G_{12,21}$ = 000111, 001011, 010011, 011101, 011110, 100011
$G_{12,22-1}$ = 000111, 001011, 001101, 010011, 100011, 111101
$G_{12,22-2}$ = 000111, 001011, 001101, 010011, 100101, 111000
$G_{12,22-3}$ = 000111, 001011, 010011, 100101, 101001, 110100
$G_{12,23}$ = 000111, 001011, 001101, 010011, 010101, 111001
$G_{12,24}$ = 000111, 001011, 001101, 010011, 010101, 101110
$G_{12,25}$ = 000111, 001011, 001101, 010011, 100101, 110001
$G_{12,26-1}$ = 000111, 001011, 001101, 010011, 100011, 110001
$G_{12,26-2}$ = 000111, 001011, 010011, 100101, 101001, 110001
$G_{12,27-1}$ = 000111, 001011, 001101, 010011, 010101, 111011
$G_{12,27-2}$ = 000111, 001011, 010011, 100101, 101001, 101111
$G_{12,28}$ = 000111, 001011, 001101, 001110, 010011, 110101
$G_{12,29}$ = 000111, 001011, 001101, 010011, 010101, 111000
$G_{12,30}$ = 000111, 001011, 001101, 010011, 010101, 101001
$G_{12,31-1}$ = 000111, 001011, 001101, 001110, 010011, 110001
$G_{12,31-2}$ = 000111, 001011, 001101, 010011, 010101, 100011
$G_{12,32}$ = 000111, 001011, 001101, 001110, 010011, 100101
$G_{12,33}$ = 000111, 001011, 001101, 001110, 010011, 100011
$G_{12,34}$ = 000111, 001011, 001101, 010011, 010101, 011001
$G_{12,35}$ = 000111, 001011, 001101, 001110, 010011, 010101



Therefore we have the following: 

Proposition 2 There are exactly 43 inequivalent optimal linear [12,6,4] codes. 
3 Lengths 14 and 20 

3.1 On the Classification of Length 14 

There is a unique self-dual [14, 7, 4] code, up to equivalence. Recently the
classification of even formally self-dual [14,7,4] codes has been given in [2] and [7],
independently. The classification of odd formally self-dual [14, 7, 4] codes is given
in [3]. Thus $N_S(14) = 1$, $N_E(14) = 9$ and $N_O(14) = 112$.

As with other lengths, we found all optimal linear [14, 7, 4] codes which should 
be checked to complete the classification. However, we found there were too many 
distinct weight enumerators (518 in total) to classify. These weight enumerators 
are available on the world wide web at 

http://www.math.nagoya-u.ac.jp/~koichi/data.

We were able to determine that there are at least 1360 inequivalent [14, 7, 4] 
codes, and there appear to be many more. Some examples of these codes are 
listed below as the right halves of the generator matrices. 

$G_{14,1}$ = 0000111, 0001011, 0011101, 0101101, 0110011, 1010001, 1101111
$G_{14,2}$ = 0000111, 0001011, 0011101, 0101101, 0110001, 1001110, 1110010
$G_{14,3}$ = 0000111, 0001011, 0011101, 0101101, 0110011, 1001110, 1010001
$G_{14,4}$ = 0000111, 0001011, 0011101, 0101101, 0110011, 0111110, 1001110
$G_{14,5}$ = 0000111, 0001011, 0010101, 0101101, 0110011, 1011000, 1101110



3.2 Odd Formally Self-Dual Codes of Length 20 

Since optimal linear codes with parameters [16,8,5] and [18,9,6] are unique 
from [3] and [12], the next open case is length 20. However it seems infeasible to 
classify all optimal linear [20, 10, 6] codes. Recently it has been shown in [8] that 
there are exactly six inequivalent even formally self-dual [20, 10, 6] codes. Thus 
we consider the odd formally self-dual [20, 10, 6] codes. 

We constructed all distinct $10 \times 10$ (1, 0)-matrices $A$ such that the matrices
$( I , A )$ generate odd formally self-dual [20, 10, 6] codes using the technique
described earlier. From these, we obtained 4872 distinct odd formally self-dual
[20, 10, 6] codes. Our search determined that these codes are all equivalent to the
double circulant code with generator matrix $( I , R )$ where $R$ is the circulant
matrix with first row (1111100100). The weight enumerator of this code is

1 -k 40y® -k 160y^ -k 130i/® -k 176y^° -k 320y“ -k 120y^^ -k 40y^^ -k 

Combining the results in [8] with this classification, we have the following: 

Proposition 3 There are exactly seven inequivalent formally self-dual [20, 10, 6] 
codes, one of which is odd. 

Combining the results in [3] with this classification, we have the following: 
Corollary 1. The optimal formally self-dual codes (that is, the codes having the 
highest minimum weight among all formally self-dual codes of that length) are 
classified for all lengths up to 24. 

The actual numbers of inequivalent optimal formally self-dual codes are listed 
in Table 1. 

3.3 Shortened Codes 

If an optimal linear [12,6,4] code has a coordinate such that the codewords of 
minimum weight are 0 in this coordinate, then the shortened code obtained by 
deleting this coordinate is an optimal [11,6,4] code. Thus our classification of 
optimal linear [12, 6, 4] codes implies the classification of optimal [11,6, 4] codes. 
Therefore we have the following: 

Corollary 2. There are exactly two inequivalent optimal [11,6,4] codes. 

3.4 Summary 

As a summary, we list in Table 1 the actual values $N_T(n)$, $N_S(n)$, $N_E(n)$ and
$N_O(n)$ for $2 \le n \le 24$.

From Table 1, we have the following observation. 

Corollary 3. Let $n(d)$ be the smallest length for which there is a linear rate 1/2
code with minimum weight $d$. For $d \le 8$, there is a unique $[n(d), n(d)/2, d]$ code,
up to equivalence.

There arises a natural question, namely is this true for larger $d$? The next
minimum weight is 9. It is not known if there is a linear [38, 19, 9] code. But
there are inequivalent linear [40,20,9] codes, for example, $D_{40,1}$ and $D_{40,2}$ in
Table 2.

4 Optimal Formally Self-Dual Codes 

In this section, we show that at least one optimal linear $[n, n/2, d_{\max}(n)]$ code
is self-dual or formally self-dual for lengths up to 48. We use double circulant
codes as examples of formally self-dual codes.

A pure double circulant code of length $2n$ has a generator matrix of the form

$$( I , R ),$$

where $I$ is the identity matrix of order $n$ and $R$ is an $n \times n$ circulant matrix. A
code with generator matrix of the form

$$
\left( \; I \;\middle|\;
\begin{array}{cccc}
\alpha & \beta & \cdots & \beta \\
\gamma & & & \\
\vdots & & R' & \\
\gamma & & &
\end{array}
\right)
$$
Table 1. $N_T(n)$, $N_S(n)$, $N_E(n)$ and $N_O(n)$ for $2 \le n \le 24$.

Length n   N_T(n)    d_max(n)   N_S(n)   N_E(n)   N_O(n)
    2         1         2         1        0        0
    4         3         2         1        1        1
    6         1         3         0        0        1
    8         1         4         1        0        0
   10         4         4         0        1        1
   12        43         4         1        2        5
   14      ≥ 1360       4         1        9      112
   16         1         5         0        0        1
   18         1         6         0        1        0
   20       ≥ 8         6         0        7        1
   22         1         7         0        0        1
   24         1         8         1        0        0


Table 2. New optimal double circulant codes.

Code     Parameters    First Row                    |Aut(D_n)|
D26      [26,13,7]     1111001000100                    78
D40,1    [40,20,9]     00000000101111010011             20
D40,2    [40,20,9]     00000001010111100011             20
D46      [46,23,11]    00000110110111110101111          23
D50      [50,25,10]    0000000000010100111111011        25

where $R'$ is an $(n-1) \times (n-1)$ circulant matrix, is called a bordered double
circulant code of length $2n$. These two families of codes are collectively called
double circulant codes. Note that all double circulant codes are formally self-dual.
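To make the pure double circulant construction concrete, here is a small sketch that builds $( I , R )$ from a first row and compares the weight distribution of the code with that of its dual, generated by $( R^T , I )$. The function names are assumptions of this illustration, and the first row used is the one quoted in Section 3.2 for length 20; the sketch does not reproduce any weight enumerator from the text.

```python
from collections import Counter


def circulant(first_row):
    """Rows of the circulant matrix whose i-th row is first_row shifted right i times."""
    n = len(first_row)
    return [first_row[n - i:] + first_row[:n - i] for i in range(n)]


def unit_row(i, n):
    """i-th row of the identity matrix of order n, as a bit string."""
    return "0" * i + "1" + "0" * (n - i - 1)


def pure_double_circulant(first_row):
    """Generator rows of (I, R)."""
    n = len(first_row)
    return [unit_row(i, n) + r for i, r in enumerate(circulant(first_row))]


def dual_generator(first_row):
    """Generator rows of the dual code, (R^T, I)."""
    n = len(first_row)
    R = circulant(first_row)
    return ["".join(R[j][i] for j in range(n)) + unit_row(i, n) for i in range(n)]


def weight_enumerator(rows):
    """Weight distribution {weight: count} of the span of the given rows."""
    words = [0]
    for row in rows:
        g = int(row, 2)
        words += [w ^ g for w in words]
    return dict(sorted(Counter(bin(w).count("1") for w in words).items()))


first_row = "1111100100"  # first row quoted for the length-20 code in Section 3.2
code = pure_double_circulant(first_row)
dual = dual_generator(first_row)
print(weight_enumerator(code) == weight_enumerator(dual))  # True
```

The comparison returns True for any first row, since $( R^T , I )$ is obtained from $( I , R )$ by a coordinate permutation when $R$ is circulant; this is the formal self-duality noted above.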

In Table 2, we present several new optimal pure double circulant codes which 
are odd formally self-dual. 

The weight enumerators $W_{D_n}$ of the codes in Table 2 are given below.

= 1 + + 273/ -f 338/ -f 598y^° -f 923y^^ -f -f 1340y^^ 

-f 1300y^^ -f 923y^® -f 598y^® -f 338y^^ -f 182y^® -h -f 39/°, 

= 1 + 320/ -f 1012i/i° -h 2140?/“ -f 5400?/i2 U660y^° -h 21300?/“ 

-f37844y“ -h 60970?/“ -f 84900?/“ -f 107880?/“ -f 126040?/“ 

-f 131376/° -f 123960y“ -f 107400/^ -h 84840/° -f 61160/° 

-f 38984/° -h 1540/° -f 11020/^ -f 5240/® -f 2300/° -f 964/° 
-f260/° -f 45/2 -h 20/°, 

Wd^o,2 = 1 + 280/ -f 1010?/“ -f 2420?/“ -f 5425?/“ -f 10880y“ -f 21200?/“ 
-f38824y“ -h 61115?/“ -f 84740?/“ -f 107980?/“ -f 124920?/“ 

-f 130690/° -f 125360/° -f 108520/2 -h 84320/° -f 60210/° 
+38704J/25 + 219701/^® + 11380y^^ + 51657/^® + 21607/^® + 952 t/®° 

+280t/®^ + 50t/®2 + 20t/®®, 

Wo^e = 1 + 33127/1^ + 9660y^2 uiAAQy^^ + 235290y^® + 99 1 7607/^^ 

+ 133 8 8 767/2° + 1961280y23 + 1879560 t/ 24 + 991760y27 + 672980 t/28 
+ 12 14407/®^ + 56925t/®2 + 33127 /°® + 1012t/°°, 

= 1 + 350t/°° + 11757/1^ + 3425t/° 2 + 11325//°° + 28650y°° + 65575 t/°® 

+ 144725//°® + 294525//°°^ + 542200//°® + 903875//°® + 1400185//2® 
+2009425//®° + 26 4 5 3 507/2® + 3226275//®° + 3625875//®"° + 3750241//®® 
+3619400//®® + 3231525//®°' + 2645275//®° + 2008375//®® + 1407350//°° 
+900725//°° + 533350//°® + 295575//°° + 150000//°° + 66625//°® 
+27275//°® + 10875//°°' + 3450//°® + 1025//°® + 305//°° + 75//°° + 50//°®. 

In Table 3, we list the values for $d_{\max}(n)$. We also indicate in this table
whether one of the optimal linear codes is self-dual or formally self-dual. In the
third (resp. fourth) column “Y” means that one of the optimal linear codes is
a self-dual code (resp. formally self-dual code which is not self-dual) and “N”
means that there is no self-dual $[n, n/2, d_{\max}(n)]$ code (resp. formally self-dual
$[n, n/2, d_{\max}(n)]$ code which is not self-dual). For information on self-dual codes,
see [5, Table I]. We list the references only for formally self-dual codes.

Thus we have the following: 

Proposition 4 Let $d_{\max}(n)$ be the highest minimum weight among all linear
$[n, n/2]$ codes. At least one optimal linear $[n, n/2, d_{\max}(n)]$ code is self-dual or
formally self-dual for length $n \le 36$ and $n = 42, 44, 46, 48$.

Remark 1. For $n = 38, 40$ and 50, the exact value of $d_{\max}(n)$ has not yet been
determined.

Let $d_{k\max}(n)$ be the highest minimum weight among all known linear $[n, n/2]$
codes. Then at least one optimal linear $[n, n/2, d_{k\max}(n)]$ code is self-dual or
formally self-dual for length $n \le 50$.
Length n   d_max(n)   SD            FSD           Reference (FSD)
    2         2       Y             Y             -
    4         2       Y             Y             [3], [11]
    6         3       N             Y             [3]
    8         4       Y             N             -
   10         4       N             Y             [3], [9], [11]
   12         4       Y             Y             [1], [3], [9]
   14         4       Y             Y             [2], [3], [7], [9]
   16         5       N             Y             [3]
   18         6       N             Y             [12]
   20         6       N             Y
   22         7       N             Y             [3]
   24         8       Y             N             -
   26         7       N             Y             Section 4
   28         8       N             Y             [9], [10], [11]
   30         8       N             Y             [9], [11]
   32         8       Y             Y             [9]
   34         8       N             Y             [9]
   36         8       Y             Y             [9]
   38        8-9      ?, Y(d = 8)   ?, Y(d = 8)   [9]
   40        9-10     N             ?, Y(d = 9)   Section 4
   42        10       N             Y             [9]
   44        10       N             Y             [9]
   46        11       N             Y
   48        12
Abstract. We consider error-correcting codes over mixed alphabets with
$n_2$ binary and $n_3$ ternary coordinates, and denote the maximum
cardinality of such a code with minimum distance $d$ by $N(n_2, n_3, d)$. We here
study this function for short codes ($n_2 + n_3 \le 13$) and $d = 4$. A
computer-aided method is used to settle 14 values, and bounds for 34 other entries
are improved. In the method used, codes are built up from smaller codes
using backtracking and isomorphism rejection. Codes of short length are
further classified.



1 Introduction 

We consider codes in the Hamming space with $n_2$ binary coordinates and $n_3$
ternary coordinates. A code $C \subseteq F_2^{n_2} F_3^{n_3}$ with minimum distance greater than
or equal to $d$ is said to be an $(n_2, n_3, d)_{|C|}$ code. The maximum cardinality of
a code in $F_2^{n_2} F_3^{n_3}$ with minimum distance $d$ is denoted by $N(n_2, n_3, d)$, and
corresponding codes are called optimal.

This paper is one in a series of papers where a promising approach for
classifying error-correcting codes has been considered. In the seminal paper [7], an
old open problem in combinatorial coding theory is settled by showing that for
binary codes, $N(10, 0, 3) = N(11, 0, 4) = 72$ and $N(11, 0, 3) = N(12, 0, 4) = 144$.
In [6] the approach is carried over to binary/ternary mixed codes with $d = 3$,
and many new bounds on $N(n_2, n_3, 3)$, $n_2 + n_3 \le 13$, are obtained. We here
continue the research and consider $N(n_2, n_3, 4)$ for $n_2 + n_3 \le 13$.

A code can occur in many equivalent forms, and in studying optimal codes 
we are only interested in one of these forms. Two codes are said to be equivalent 
if a permutation of the coordinates and permutations of the coordinate values, 
one for each coordinate, map the codewords of one code onto those of the other. 
For mixed alphabets, we are only allowed to permute coordinates over the same 
alphabet. A mapping of a code onto its own codewords is an automorphism of 
the code. The set of all automorphisms forms the full automorphism group of a 
code. 

A procedure for showing code equivalence is discussed in Section 2. In Section 
3 the classification approach, which is based on backtrack search and rejection of 
equivalent codes, is considered. The results of the search are presented in Section
4, where an updated table of $N(n_2, n_3, 4)$, $n_2 + n_3 \le 13$, is given. The new table
contains 48 improved entries compared with the previously published table [1].
Classified, optimal codes are listed in the Appendix; these can also be obtained 
electronically. 

2 Code Equivalence 

Proving equivalence of codes is a nontrivial task. In this work, we transform this 
problem into the graph isomorphism problem, and we can then use the program 
nauty [4] written by Brendan McKay. The program nauty can be used to get, 
for example, the full automorphism group (the size and generators) of a graph. 
We will expressly need the feature that nauty can produce canonical labellings 
of graphs. 

The canonical labellings of two graphs coincide exactly when the graphs are 
isomorphic. We will thus have to transform codes into graphs so that two codes 
are equivalent exactly when the corresponding graphs are isomorphic. Then, by 
comparing the canonical labellings of the graphs, equivalence or inequivalence 
can be established. In the following way, such a transformation into vertex- 
colored graphs is carried out (the purpose of the vertex-coloring is that an
automorphism is not allowed to map a vertex into another vertex with a different
color).

For every codeword of a given code, we create a vertex and color these with 
one color. For each of the coordinates, one vertex is created for every possible 
coordinate value (here, 2 or 3 vertices). All these are colored with one color that 
differs from that of the codeword vertices. Edges are finally inserted. There is an 
edge from a codeword vertex to a coordinate vertex exactly when that codeword 
has the corresponding value in that coordinate. Moreover, edges are inserted 
so that the subgraph induced by all coordinate vertices of any coordinate is a 
complete graph. 
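The transformation just described can be sketched in a few lines. The vertex naming scheme and the function name `code_to_colored_graph` below are assumptions of this illustration; the sketch only builds the colored graph and does not call nauty or compute a canonical labelling.

```python
from itertools import combinations


def code_to_colored_graph(code, alphabet_sizes):
    """Build the vertex-colored graph of a mixed code.

    `code` is a list of words (tuples of coordinate values) and
    `alphabet_sizes[j]` is 2 or 3 for coordinate j.
    Returns (colors, edges): colors maps each vertex to 0 (codeword vertex)
    or 1 (coordinate-value vertex); edges is a set of 2-element frozensets.
    """
    colors = {}
    edges = set()
    for i in range(len(code)):
        colors[("w", i)] = 0  # one vertex per codeword
    for j, q in enumerate(alphabet_sizes):
        values = [("c", j, v) for v in range(q)]
        for vertex in values:
            colors[vertex] = 1  # one vertex per possible coordinate value
        for a, b in combinations(values, 2):
            edges.add(frozenset((a, b)))  # complete graph on each coordinate
    for i, word in enumerate(code):
        for j, v in enumerate(word):
            edges.add(frozenset((("w", i), ("c", j, v))))
    return colors, edges
```

Because a binary coordinate contributes a 2-clique and a ternary coordinate a triangle, a color-preserving automorphism can only map a coordinate to another coordinate over the same alphabet, which matches the restriction on equivalence for mixed codes.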

To illustrate the mapping described above, we show in Fig. 1 the graphs that
correspond to all three inequivalent $(4, 1, 4)_2$ codes. The reader is encouraged
to ascertain that the graphs are indeed nonisomorphic (and therefore that the
codes are inequivalent) and that there are no other inequivalent codes.



3 Searching for Codes 

The algorithm outlined here is analogous to that presented in [6,7]. The proce- 
dure for proving code equivalence plays a central role in our backtrack search for 
codes. One difficulty in a computer search for codes is the multitude of
equivalent codes. There are in fact exactly $2^{n_2} 6^{n_3} n_2!\, n_3! / |\mathrm{Aut}(C)|$ codes equivalent to
a given binary/ternary mixed code $C$. By fixing subcodes of a code that we are
looking for, we can eliminate most of the branches in the search tree that lead 
to equivalent codes. This can be done in the following way. 
Delete any one of the binary coordinates of an $(n_2, n_3, d)_M$ code with $n_2 \ge 1$,
and let $C_0$ and $C_1$ be the shortened codes which had a 0 and a 1, respectively,
in the deleted coordinate. Trivially, the minimum distance of both $C_0$ and $C_1$ is
at least $d$, and one of the codes has at least $\lceil M/2 \rceil$ codewords. Therefore, if we
have all inequivalent $(n_2 - 1, n_3, d)_{M'}$ codes with $M' \ge \lceil M/2 \rceil$, we can get all
inequivalent $(n_2, n_3, d)_M$ codes by lengthening these.

The approach used is thus to recursively build up longer codes, starting off 
from very small codes — with one, two, or three words — where all inequivalent 
codes can be constructed by hand. In each step, one binary coordinate is added 
to the code (this is not always the case; see later discussion), and all codes with 
prescribed parameters are constructed by backtrack search (and equivalent codes 
are removed using the procedure described earlier). This process is continued un- 
til the limit of the computers used is reached — as for computing time or memory 
requirement. The backtrack search is still to be described. 

To find all $(n_2, n_3, d)_M$ codes, we fix $C_0$ to be an $(n_2 - 1, n_3, d)_{M'}$ code
with $M' \ge \lceil M/2 \rceil$, and this step is done for all such subcodes that have been
obtained earlier in this recursive process. Having fixed $C_0$, we search for words
in $C_1$ using backtracking. Actually, we then have an instance of the maximum
independent set problem, which is an NP-complete problem [2]. (This problem 
is computationally equivalent to several other important problems such as the 
maximum clique problem.) 

An independent set in a graph G = {V, E) is a subset of the vertices V, such 
that no two vertices in the set are adjacent. For our problem, we create a graph 
with a vertex for each word with a 1 in the new coordinate that is at distance at 
least $d$ from $C_0$ (that is, the word is a potential word in $C_1$). Moreover, since the
words in $C_1$ must be at distance at least $d$ from each other, edges are inserted
into our graph whenever the corresponding words are at distance less than d 
from each other. We are now facing the problem of finding an independent set 
of size M — M' in this graph. 
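A minimal sketch of this graph construction is given below. It works with full-length words whose first coordinate is the newly added binary coordinate; this placement, the function names, and the tuple representation are all assumptions made for the illustration.

```python
from itertools import combinations, product


def distance(u, v):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(u, v))


def conflict_graph(c0, alphabet_sizes, d):
    """Candidate words for C1 and the edges between incompatible candidates.

    `c0` lists the fixed words, each with a 0 in the first (new) coordinate;
    `alphabet_sizes` gives the alphabet size of every remaining coordinate.
    A vertex is a word with a 1 in the new coordinate at distance >= d from
    every word of c0; an edge joins two candidates at distance < d.
    """
    candidates = [
        (1,) + tail
        for tail in product(*(range(q) for q in alphabet_sizes))
        if all(distance((1,) + tail, w) >= d for w in c0)
    ]
    edges = {
        (i, j)
        for (i, u), (j, v) in combinations(enumerate(candidates), 2)
        if distance(u, v) < d
    }
    return candidates, edges


# Lengthening the one-word code {000} over two binary and one ternary coordinate:
candidates, edges = conflict_graph([(0, 0, 0, 0)], [2, 2, 3], 4)
print(candidates)  # [(1, 1, 1, 1), (1, 1, 1, 2)]
print(edges)       # {(0, 1)}
```

In this toy case the two candidates are adjacent, so the largest independent set has one vertex and the lengthened code has two words, in agreement with $N(3,1,4) = 2$.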

In the search for an independent set (the words of $C_1$), we developed the
following algorithm. The algorithm works surprisingly well and is yet to be
compared with other published algorithms for this graph problem. To apply the
algorithm, we need to impose a total order on the vertices of the graph (and the
vertices are indexed $1, 2, \ldots, |V|$ starting from the smallest vertex). For codes,
the lexicographic order appears to be a good choice.

For all independent sets of $m$ ($\le M - M' = M''$) vertices, we want to know the
maximum value possible of the minimum index in such a set. We denote this
function (for a given graph) by $g(m)$. Thus, $g(1) = |V|$, and for $m \ge 2$ we
obtain $g(m)$ in a backtrack search by trying to find an independent set where
the smallest index is $g(m-1) - 1, g(m-1) - 2, \ldots$. The values of $g(m')$ for
$m' < m$ are used to effectively prune the search. A value of $g(m)$ is obtained by
finding a corresponding independent set. In particular, when we obtain $g(M'')$
we have a solution to the original problem, and we proceed to find all solutions
(and remove equivalent codes using the procedure described earlier).
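One possible reading of this procedure is sketched below. The function names and the adjacency-dictionary representation are assumptions, and the sketch only computes the values $g(m)$ (finding one witness set per value); it does not enumerate all solutions or use the function $h(m)$ discussed next.

```python
def max_min_index(n, adj, m_max):
    """Compute g(1), ..., g(m_max) for a graph on vertices 1..n.

    `adj[v]` is the set of neighbours of v. g(m) is the largest possible
    smallest index of an independent set with m vertices. The returned dict
    stops early at the first m for which no independent set exists.
    """
    if n < 1:
        return {}
    g = {1: n}
    for m in range(2, m_max + 1):
        found = None
        # The smallest index of an m-set is at most g(m - 1) - 1.
        for i in range(g[m - 1] - 1, 0, -1):
            cands = [v for v in range(i + 1, n + 1) if v not in adj[i]]
            if _extend(cands, m - 1, adj, g):
                found = i
                break
        if found is None:
            break
        g[m] = found
    return g


def _extend(cands, need, adj, g):
    """Can `need` mutually non-adjacent vertices be chosen from cands (ascending)?"""
    if need == 0:
        return True
    for idx, v in enumerate(cands):
        if len(cands) - idx < need:
            return False  # not enough vertices left
        # The other need - 1 vertices must all exceed v, so g(need - 1) > v is required.
        if need > 1 and g[need - 1] <= v:
            return False
        rest = [u for u in cands[idx + 1:] if u not in adj[v]]
        if _extend(rest, need - 1, adj, g):
            return True
    return False
```

For the path 1-2-3-4-5 this gives $g(1) = 5$, $g(2) = 3$ (from the set {3, 5}) and $g(3) = 1$ (from {1, 3, 5}), and the computation stops there because no independent set with four vertices exists.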
Another function $h(m)$ denoting the minimum value possible of the maximum
index in an independent set with $m$ vertices (for $m \le M''/2$) turns out to be
useful if there is no independent set with $M''$ vertices. Namely, if we first calculate
the first few values of $h(m)$, then in calculating the values of $g(m)$, we can stop
the whole search if for some value of $m$, $g(m) < h(M'' - m)$.

For codes with only ternary coordinates, we add one ternary coordinate at
a time. Analogously to the case of adding one binary coordinate at a time,
$(n_2, n_3 - 1, d)_{M'}$ codes with $M/3 \le M' \le N(n_2, n_3 - 1, d)$ are lengthened to get
$(n_2, n_3, d)_M$ codes. Other possibilities for lengthening codes are discussed in [6].

4 The Results 

In the binary case, codes with even minimum distance $2d$ can be obtained by
extending codes with minimum distance $2d - 1$; cf. [7]. With mixed binary/ternary
codes with at least one ternary coordinate such an extension is not always
possible, so we have to consider the odd-distance and even-distance cases separately.
The following (obvious) inequality, however, can in many cases be applied to
improve upper bounds:

$$N(n_2, n_3 + 1, d) \le N(n_2, n_3, d - 1). \qquad (1)$$

The parameters of the classified codes that were needed to get the new results 
in this paper are given in Table 1. The number of codes with given parameters is 
shown, with a 0 indicating that no such codes exist (such an entry is included only 
if it improves on an earlier upper bound). The codes were built up by adding one 
binary coordinate at a time as described in the previous section. Codes with only 
ternary coordinates, however, were built up by adding one ternary coordinate at 
a time. 

With a few exceptions, the classified optimal $(n_2, n_3, 4)_M$ codes are listed
in the Appendix. All codes with parameters given in Table 1 can be obtained
electronically from URL: http://www.tcs.hut.fi/~pat/234.html. The results
were obtained using approximately 3 months of CPU time, and the search was
sped up by distributing it over several computers using the program autoson [5].

We present exact values and best known bounds on $N(n_2, n_3, 4)$, $n_2 + n_3 \le 13$,
$n_3 \ge 1$ in Table 2. For the case $n_3 = 0$, see [7]. A previous table for the same
function was published in [1, Table II-C]. Most bounds obtained in this paper
either come directly from a result in Table 1 or from such a result and one of
the inequalities

$$N(n_2 + 1, n_3, 4) \le 2 N(n_2, n_3, 4), \qquad (2)$$

$$N(n_2, n_3 + 1, 4) \le 3 N(n_2, n_3, 4). \qquad (3)$$
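The way inequalities (2) and (3) spread a single nonexistence result through the table can be illustrated with a short sketch. The function name `propagate_upper_bounds` is an assumption, and the two seed values are read off Table 1: no $(6,3,4)_{35}$ code and no $(0,7,4)_{34}$ code exist, so $N(6,3,4) \le 34$ and $N(0,7,4) \le 33$. The output contains only the bounds implied by these seeds, not the full table.

```python
def propagate_upper_bounds(seeds, max_length=13):
    """Apply N(n2+1,n3,4) <= 2 N(n2,n3,4) and N(n2,n3+1,4) <= 3 N(n2,n3,4).

    `seeds` maps (n2, n3) to a known upper bound. Returns a dict with the
    best upper bound derivable from the seeds for every reachable (n2, n3)
    with n2 + n3 <= max_length.
    """
    ub = dict(seeds)
    for total in range(1, max_length + 1):
        for n2 in range(total + 1):
            n3 = total - n2
            options = []
            if (n2, n3) in ub:
                options.append(ub[(n2, n3)])
            if (n2 - 1, n3) in ub:
                options.append(2 * ub[(n2 - 1, n3)])  # inequality (2)
            if (n2, n3 - 1) in ub:
                options.append(3 * ub[(n2, n3 - 1)])  # inequality (3)
            if options:
                ub[(n2, n3)] = min(options)
    return ub


bounds = propagate_upper_bounds({(6, 3): 34, (0, 7): 33})
print(bounds[(7, 3)], bounds[(8, 3)], bounds[(9, 3)], bounds[(10, 3)])  # 68 136 272 544
print(bounds[(0, 8)], bounds[(1, 7)], bounds[(1, 8)])                   # 99 66 198
```

These values agree with the upper bounds listed in Table 2 for the same parameters.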

By counting the marked bounds in Table 2, one can see that 48 entries have 
been improved since [1] appeared. 
Table 1. Number of inequivalent $(n_2, n_3, 4)_M$ codes.

n2 n3    M       #   |  n2 n3    M       #   |  n2 n3    M        #
 3  1    1       1   |   4  3    9    1122   |   2  5   17    35124
 3  1    2       1   |   4  3   10      98   |   2  5   18     4506
 4  1    2       3   |   4  3   11       3   |   2  5   19      534
 5  1    3       2   |   5  3   18   43751   |   2  5   20       63
 5  1    4       2   |   5  3   19     916   |   2  5   21        6
 6  1    5      15   |   5  3   20      21   |   2  5   22        2
 6  1    6      15   |   6  3   35       0   |   3  5   33    84372
 6  1    7       4   |   0  4    2       1   |   3  5   34     1123
 6  1    8       3   |   0  4    3       1   |   3  5   35       10
 7  1   10    2103   |   1  4    4       1   |   3  5   36        1
 7  1   11     732   |   2  4    7       9   |   3  5   37        0
 7  1   12     309   |   2  4    8       2   |   4  5   65        0
 7  1   13     101   |   3  4   13     582   |   0  6   12       77
 7  1   14      41   |   3  4   14      50   |   0  6   13       21
 7  1   15      15   |   3  4   15       1   |   0  6   14        9
 7  1   16       7   |   4  4   25   12848   |   0  6   15        4
 8  1   20    2328   |   4  4   26     845   |   0  6   16        2
 9  1   40   39720   |   4  4   27      35   |   0  6   17        1
 2  2    1       1   |   4  4   28       2   |   0  6   18        1
 2  2    2       1   |   5  4   49      50   |   1  6   26   111087
 3  2    2       3   |   5  4   50       3   |   1  6   27    21516
 3  2    3       1   |   5  4   51       0   |   1  6   28     3890
 4  2    4      10   |   6  4   97       0   |   1  6   29      675
 4  2    5       1   |   0  5    4       2   |   1  6   30      122
 4  2    6       1   |   0  5    5       1   |   1  6   31       23
 5  2    8      50   |   0  5    6       1   |   1  6   32        6
 6  2   16     702   |   1  5    8     292   |   1  6   33        2
 1  3    2       1   |   1  5    9      32   |   2  6   51        1
 2  3    3       2   |   1  5   10       7   |   2  6   52        0
 3  3    5      12   |   1  5   11       2   |   3  6  101        0
 3  3    6       4   |   1  5   12       1   |   0  7   34        0
Table 2. Bounds on $N(n_2, n_3, 4)$ for $n_2 + n_3 \le 13$, $n_3 \ge 1$. Key to Table:
Unmarked [1]; ^a This paper (Table 1); ^b Eq. (2); ^c Eq. (3); ^d [6,7] and Eq. (1);
* Lower bound improved in this paper.

n2\n3   1        2           3           4          5            6             7
  0     1        1           1           3          9            18            33^a
  1     1        1           2           4          12           33            66^b
  2     1        2           3           8          22           51^a          108-132^b
  3     2        3           6           15         *36^a        87-100^a      216-264^b
  4     2        6           11          28^d       58-64^a      144-200^b     360-528^b
  5     4        8           20          *50^a      108-128^b    288-400^b     612-1056^b
  6     8        16          34^a        96^a       208-256^b    576-800^b     1152-2112^b
  7     16       26^d        64-68^b     192^b      384-512^b    1152-1600^b
  8     20       48-50^d     128-136^b   384^b      768-1024^b
  9     40       96-100^b    256-272^b   540-768^b
 10     72^d     192-200^b   400-544^b
 11     144^b    384
 12     256

n2\n3   8             9              10           11          12           13
  0     99^c          243-297^c      729-891^c    1458-2561   4374-7029    8019-19682
  1     162-198^b     486-594^b      972-1749     2916-4920   8019-14058
  2     324-396^b     729-1188^b     1944-3498    5589-9777
  3     486-792^b     1296-2376^b    3726-6791
  4     891-1584^b    2484-4752^b
  5     1674-3168^b
In previous studies using the method discussed in this paper [6,7], only
nonexistence bounds were improved. Therefore, it is a positive surprise that we are
here able to find codes that improve on the existence bounds. Moreover, both of
these cases are settled: $N(5,4,4) = 50$ and $N(3,5,4) = 36$.

In particular, note that we have settled two entries for purely ternary codes.
These codes lead to other improvements for ternary codes and therefore on
the bounds in [1, Table I]. Ternary error-correcting codes were previously also
considered in [8].



Appendix 

We here list the optimal codes with at least two words classified in this paper.
Due to the large number of optimal codes for some parameters, the following
codes are not listed but they can be obtained electronically from the author
as mentioned in the main text: $(8,1,4)_{20}$, $(9,1,4)_{40}$, $(5,2,4)_8$, $(6,2,4)_{16}$, and
$(5,3,4)_{20}$ (there are, respectively, 2328, 39720, 50, 702, and 21 such codes). The
codewords are compressed with the binary part in hexadecimal form and the
ternary part in base 9. The binary and ternary parts are separately right-justified
(so the length of the compressed words is $\lceil n_2/4 \rceil + \lceil n_3/2 \rceil$).
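The compressed notation can be unpacked as sketched below. This decoder is an illustration under the assumption that each base-9 digit stands for two ternary digits (most significant first) and each hexadecimal digit for four bits; the function names are hypothetical.

```python
def decode_word(text, n2, n3):
    """Expand a compressed codeword into (binary part, ternary part) tuples."""
    hex_len = (n2 + 3) // 4   # ceil(n2 / 4) hexadecimal digits
    nine_len = (n3 + 1) // 2  # ceil(n3 / 2) base-9 digits
    if len(text) != hex_len + nine_len:
        raise ValueError("unexpected length for these parameters")
    bits = []
    for ch in text[:hex_len]:
        bits += [int(b) for b in format(int(ch, 16), "04b")]
    trits = []
    for ch in text[hex_len:]:
        digit = int(ch, 9)
        trits += [digit // 3, digit % 3]
    # Both parts are right-justified, so surplus leading digits are dropped.
    return tuple(bits[len(bits) - n2:]), tuple(trits[len(trits) - n3:])


def minimum_distance(words, n2, n3):
    """Minimum Hamming distance of a list of compressed codewords."""
    flat = [b + t for b, t in (decode_word(w, n2, n3) for w in words)]
    return min(
        sum(x != y for x, y in zip(u, v))
        for i, u in enumerate(flat)
        for v in flat[i + 1:]
    )


print(decode_word("71", 3, 1))   # ((1, 1, 1), (1,))
print(decode_word("114", 1, 3))  # ((1,), (1, 1, 1))
print(minimum_distance(["00", "58", "64", "94", "A8", "F0"], 4, 2))  # 4
```

The last line checks the six-word code listed below for $N(4,2,4) = 6$ and confirms that its minimum distance is 4.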



A^(3,l,4) = 2: {00, 71}. 

A^(4, 1, 4) = 2: {00, F2|, {00, FOj, {00, E2|. 
fV(5,l,4) =4: {000, 0F2, 161, 191}, {000, OFO, 162, 192}. 

7V(6,1,4) = 8: {000, 0F2, 161, 191, 251, 2A1, 330, 3C0}, {000, 0F2, 161, 191, 
251, 2A1, 330, 3C2}, {000, OFO, 162, 192, 252, 2A2, 330, 3C0}. 

A^(7, 1,4) = 16: {000, 0F2, 161, 191, 251, 2A1, 330, 3C0, 431, 4C1, 550, 5A0, 

660, 690, 701, 7F1}, {000, 0F2, 161, 191, 251, 2A1, 330, 3C0, 431, 4C1, 550, 

5A0, 660, 690, 702, 7F1}, {000, 0F2, 161, 191, 251, 2A1, 330, 3C0, 431, 4C1, 

550, 5A0, 660, 692, 701, 7F1}, {000, 0F2, 161, 191, 251, 2A1, 330, 3C0, 431, 

4C1, 550, 5A0, 662, 692, 701, 7F1}, {000, 0F2, 161, 191, 251, 2A1, 330, 3C0, 

431, 4C1, 550, 5A2, 662, 692, 701, 7F1}, {000, 0F2, 161, 191, 251, 2A1, 330, 

3C2, 431, 4C1, 550, 5A2, 660, 692, 701 , 7F1}, {000, OFO, 162, 192, 252, 2A2, 

330, 3C0, 432, 4C2, 550, 5A0, 660, 690, 702, 7F2}. 

N(2,2,4) = 2: {00, 34}.

N(3,2,4) = 3: {00, 58, 64}.

N(4,2,4) = 6: {00, 58, 64, 94, A8, F0}.

N(1,3,4) = 2: {000, 114}.

N(2,3,4) = 3: {000, 114, 328}, {000, 114, 326}.

N(3,3,4) = 6: {000, 114, 328, 427, 605, 710}, {000, 114, 328, 427, 613, 701},
{000, 114, 326, 427, 613, 701}, {000, 114, 326, 425, 607, 712}.

N(4,3,4) = 11: {000, 114, 326, 522, 604, 828, A13, B01, C11, D03, F18}, {000,
114, 326, 522, 604, 818, A21, B05, C23, D07, F10}, {000, 114, 326, 522, 605, 828,
A13, B01, C11, D03, F18}.

N(0,4,4) = 3: {00, 44, 88}.

N(1,4,4) = 4: {000, 044, 156, 165}.

N(2,4,4) = 8: {000, 044, 156, 165, 252, 267, 324, 370}, {000, 044, 156, 165, 238,
283, 322, 377}.

N(3,4,4) = 15: {000, 044, 156, 165, 288, 307, 370, 427, 472, 513, 531, 605, 650,
748, 784}.

N(4,4,4) = 28: {000, 044, 156, 165, 287, 313, 331, 476, 524, 542, 611, 655, 708,
780, 838, 883, 917, A04, A40, B22, B66, C15, C51, D33, D88, E26, E62, F74},
{000, 044, 156, 165, 283, 322, 377, 452, 467, 524, 570, 605, 646, 731, 825, 876,
981, A57, A62, B04, B40, C33, D02, D47, E20, E74, F55, F66}.

N(5,4,4) = 50: {0000, 0044, 0156, 0165, 0288, 0307, 0342, 0481, 0518, 0605,
0646, 0720, 0774, 0852, 0867, 0911, 0A33, 0B25, 0B76, 0C13, 0D54, 0D60, 0E27,
0E72, 0F38, 1038, 1122, 1177, 1250, 1264, 1313, 1426, 1475, 1531, 1611, 1755,
1766, 1824, 1870, 1906, 1945, 1A18, 1B57, 1B62, 1C02, 1C47, 1D88, 1E83, 1F04,
1F40}, {0000, 0044, 0156, 0165, 0288, 0307, 0342, 0481, 0518, 0605, 0646, 0720,
0774, 0852, 0867, 0911, 0A33, 0B25, 0B76, 0C13, 0D54, 0D60, 0E27, 0E72, 0F38,
1038, 1122, 1177, 1250, 1264, 1313, 1426, 1475, 1533, 1611, 1757, 1762, 1824,
1870, 1906, 1945, 1A02, 1A47, 1B81, 1C31, 1D88, 1E55, 1E66, 1F04, 1F40},
{0000, 0044, 0156, 0165, 0288, 0307, 0342, 0481, 0518, 0605, 0646, 0720, 0774,
0852, 0867, 0911, 0A33, 0B25, 0B76, 0C13, 0D54, 0D60, 0E27, 0E72, 0F38,
1038, 1122, 1177, 1250, 1264, 1313, 1426, 1475, 1533, 1611, 1757, 1762, 1824,
1870, 1906, 1945, 1A18, 1B81, 1C02, 1C47, 1D88, 1E55, 1E66, 1F04, 1F40}.

N(0,5,4) = 6: {000, 044, 156, 165, 218, 281}.

N(1,5,4) = 12: {0000, 0044, 0156, 0165, 0218, 0281, 1038, 1083, 1122, 1177,
1204, 1240}.

N(2,5,4) = 22: {0000, 0044, 0183, 0227, 0272, 1056, 1065, 1122, 1177, 1231,
2025, 2076, 2111, 2138, 2250, 2264, 3007, 3042, 3154, 3160, 3213, 3288}, {0000,
0044, 0183, 0227, 0272, 1056, 1065, 1122, 1177, 1213, 1231, 2025, 2076, 2111,
2138, 2250, 2264, 3007, 3042, 3154, 3160, 3288}.

N(3,5,4) = 36: {0000, 0044, 0187, 0218, 1085, 1113, 1251, 1266, 2072, 2125,
2131, 2256, 3027, 3033, 3148, 3180, 3202, 3274, 4052, 4076, 4111, 4165, 4223,
4237, 5004, 5128, 5130, 5272, 6008, 6084, 6143, 6260, 7041, 7167, 7216, 7255}.

N(0,6,4) = 18: {000, 044, 156, 165, 218, 281, 338, 383, 422, 477, 504, 540, 627,
672, 713, 731, 855, 866}.

N(1,6,4) = 33: {0000, 0044, 0183, 0227, 0272, 0356, 0365, 0422, 0477, 0531,
0618, 0704, 0740, 0855, 0866, 1025, 1076, 1111, 1138, 1250, 1264, 1307, 1342,
1454, 1460, 1513, 1588, 1633, 1681, 1726, 1775, 1802, 1847}, {0000, 0044, 0183,
0227, 0272, 0356, 0365, 0422, 0477, 0513, 0531, 0618, 0704, 0740, 0855, 0866,
1025, 1076, 1111, 1138, 1250, 1264, 1307, 1342, 1454, 1460, 1588, 1633, 1681,
1726, 1775, 1802, 1847}.

N(2,6,4) = 51: {0000, 0044, 0156, 0165, 0218, 0281, 0338, 0383, 0422, 0477,
0504, 0540, 0627, 0672, 0713, 0731, 0855, 0866, 1124, 1142, 1237, 1273, 1315,
1351, 1526, 1562, 1646, 1664, 1708, 1780, 2107, 2170, 2223, 2232, 2316, 2361,
2557, 2575, 2605, 2650, 2748, 2784, 3011, 3055, 3066, 3400, 3444, 3488, 3822,
3833, 3877}.
Abstract. In this paper, we study the quotients that arise when the
Euclidean algorithm is applied to a primitive polynomial and $x^n - 1$.
We analyze the asymptotic behavior of the number of terms of the
quotients as $n \to \infty$. This problem comes from the study of Low-Density
Parity-Check codes. We obtain two characterizations of primitive
polynomials over the field with two elements that are based on the number
of nonzero terms in two polynomials obtained via division and the
Euclidean algorithm with polynomials of the form $x^n - 1$. The analogous
results do not hold for general finite fields but do restrict the order of
the polynomials to a small set of positive integers with specific forms.



1 Introduction 

The problems in finite fields we discuss in this paper arose from our study of 
sparse matrix codes (see Bond et al. [2]). In Section 2, we state the main results, 
which include two characterizations of primitive polynomials over the field with 
two elements that provide new criteria for the determination of the primitive- 
ness of a polynomial over the field with two elements. In Section 3, we give a 
brief description of how the properties of primitive polynomials relate to the 
construction of error-correcting codes. Section 4 contains the proofs of the main 
results and other related results. Examples are given in Section 5 to illustrate 
the theorems and their hypotheses. 

2 The Main Results 

Let $\mathbb{F}_q$ be the finite field with $q$ elements and let $f(x)$ be a polynomial in $\mathbb{F}_q[x]$.
For each positive integer $s$, let $g_s(x)$ be the greatest common divisor of $f(x)$ and
$x^s - 1$. By the Euclidean algorithm, there exist $a_s(x)$ and $b_s(x)$ such that

$$f(x)a_s(x) + (x^s - 1)b_s(x) = g_s(x). \tag{1}$$

* All correspondences should be addressed to Stefen Hui. 
We assume that $a_s(x)$ and $b_s(x)$, with $\deg(a_s(x)) < s$ and $\deg(b_s(x)) < \deg(f(x))$, have minimum
degrees and are therefore unique. The question that interests us is the following:

What is the asymptotic distribution of the number of nonzero terms in
$a_s(x)$ as $s \to \infty$?

If $f(x)$ is a factor of $x^s - 1$, then $g_s(x) = f(x)$, $a_s(x) = 1$, and $b_s(x) = 0$. If we
ignore this trivial case, we have the following theorem for primitive polynomials.
Recall that a polynomial $f(x)$ of degree $d$ in $\mathbb{F}_q[x]$ is primitive if $n = q^d - 1$ is the
smallest integer for which the polynomial divides $x^n - 1$. The smallest integer
$n$ for which $f(x)$ divides $x^n - 1$ is called the order of $f(x)$. It is known that the
order of a polynomial of degree $d$ in $\mathbb{F}_q[x]$ is at most $q^d - 1$. We refer the reader
to Lidl and Niederreiter [4] for the basic results on finite fields.
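As a small illustration of these definitions over the field with two elements, the order can be found by repeatedly multiplying by $x$ and reducing modulo $f(x)$. The sketch below is only an illustration under stated assumptions: polynomials are stored as Python integers (bit $i$ is the coefficient of $x^i$), $f(0) = 1$ is assumed, and the function names are hypothetical.

```python
def order_gf2(f):
    """Smallest n >= 1 with f(x) | x^n - 1 over GF(2); f is a bit mask with f(0) = 1."""
    d = f.bit_length() - 1            # degree of f
    r = 1                             # r holds x^n mod f, starting at x^0
    for n in range(1, 2 ** d):
        r <<= 1                       # multiply by x
        if (r >> d) & 1:
            r ^= f                    # reduce modulo f
        if r == 1:
            return n
    return None                       # only reached if f(0) = 0

def is_primitive_gf2(f):
    """f is primitive exactly when its order equals 2^d - 1."""
    d = f.bit_length() - 1
    return order_gf2(f) == 2 ** d - 1

print(order_gf2(0b10011), is_primitive_gf2(0b10011))   # 1 + x + x^4
print(order_gf2(0b11111), is_primitive_gf2(0b11111))   # 1 + x + x^2 + x^3 + x^4
```

This brute-force search takes up to $2^d - 1$ steps, so it is meant for small degrees only.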

Theorem 1. If $f(x)$ is a primitive polynomial in $\mathbb{F}_q[x]$ of degree $d$, then for
$s \in \mathbb{N}$ such that $f(x)$ is not a factor of $x^s - 1$,

$$\frac{\text{Number of nonzero terms of } a_s(x)}{s} \to \frac{q^{d-1}(q-1)}{q^d - 1} \quad \text{as } s \to \infty. \tag{2}$$

An illustration of this theorem is given in Example 1 of Section 5. Theorem 5 of 
Section 4 gives a more technical result for polynomials that are not primitive. 

If $f(x)$ is not primitive, then there may be more than one asymptotic value
even if $f(x)$ is irreducible. On the other hand, uniqueness of the asymptotic
value does not even imply that $f(x)$ is irreducible in general. Example 2 and
Example 3 of Section 5 illustrate these situations. However, the converse is true
for the finite field with 2 elements and there is a partial converse for other $q$'s.

Theorem 2. Let $f(x)$ be a polynomial in $\mathbb{F}_q[x]$ of degree $d$. If the asymptotic
condition (2) holds for $s \in \mathbb{N}$ such that $f(x)$ is not a factor of $x^s - 1$, then
$f(x)$ is primitive or the order of $f(x)$ is $m(q^d - 1)/(q - 1)$, for $m \in \{1, \ldots, q-1\}$.
In particular, $f(x)$ is primitive when $q = 2$.

A fact concerning the quotient $(x^n - 1)/f(x)$ that we need for our proofs is
the following theorem. This theorem is known but we include a proof for the
convenience of the reader.

Theorem 3. Let $f(x)$ be a primitive polynomial in $\mathbb{F}_q[x]$ with degree $d$ and let
$n = q^d - 1$. Then for any nonzero $h(x) \in \mathbb{F}_q[x]$ with degree less than $d$, the
product $h(x)(x^n - 1)/f(x)$ has $q^{d-1}(q - 1)$ nonzero terms.

There is a partial converse to Theorem 3 similar to the converse of Theorem 1.
This partial converse characterizes primitive polynomials in the set of
irreducible polynomials in $\mathbb{F}_2[x]$ and almost does so in $\mathbb{F}_q[x]$. Recall that if $f(x)$
is irreducible, then the order of $f(x)$ divides $q^d - 1$ and so $f(x)$ is a factor of
$x^{q^d - 1} - 1$.

Theorem 4. Let $f(x) \in \mathbb{F}_q[x]$ of degree $d$ be irreducible. If $(x^{q^d - 1} - 1)/f(x)$ has
$q^{d-1}(q - 1)$ nonzero terms, then $f(x)$ is primitive or has order $m(q^d - 1)/(q - 1)$,
where $m$ is a divisor of $q - 1$. In particular, $f(x)$ is primitive when $q = 2$.

The converse of Theorem 4 is not true in general. It is possible for the order of
$f$ to be a multiple of $(q^d - 1)/(q - 1)$ and yet the number of nonzero terms in
$(x^{q^d - 1} - 1)/f(x)$ not to be $q^{d-1}(q - 1)$. See Example 4 and Example 5 of Section 5.



3 Motivation 



In linear parity-check codes, the encoded sequence $c$ is obtained from the
information sequence $b$, represented by a length $n$ binary column vector, by a linear
transformation $c = Gb$, where $G$ is an $(m + n) \times n$ matrix known as the
generator matrix. A parity-check matrix for the code generated by $G$ is any matrix $H$
whose rows form a basis of the orthogonal complement of the columns of $G$.
Clearly, if $C$ is an $m \times m$ invertible matrix, $C^{-1}H$ is also a parity-check matrix. A
parity-check matrix has the property that $HG = 0$, and so a sequence $r$ is an
encoded sequence only if $Hr = 0$. In general, decoding is accomplished by solving
$He = Hr$, so that there are as few nonzero terms in $e$ as possible. Then $r - e$
is in the range of $G$ and $r$ decodes to be the inverse image of $r - e$ under $G$.

Sparse matrix parity-check codes were introduced by Gallager [3] and have
attracted much attention recently. See MacKay [6], Luby et al. [5], and Bond et al. [1].
One approach to the construction of the code is to find a sparse $\mathbb{F}_2$ parity-check
matrix so that the number of ones in the rows and columns are preassigned
constants. For the decoding algorithm to perform well, the bipartite graph
associated with the matrix should have no 2- or 4-cycles. See [1] and [6].

The most common method of implementing the sparse matrix method is to
construct an $m \times (m + n)$ parity-check matrix $H = [\, R \;\; C \,]$, where $R$ is $m \times n$
and $C$ is $m \times m$ with the required properties and then put it into systematic
form

$$\tilde{H} = C^{-1}H = [\, C^{-1}R \;\; I \,].$$

It is easy to see that a generator matrix is then

$$G = \begin{bmatrix} I \\ C^{-1}R \end{bmatrix}.$$

Of course, there is no guarantee that the matrix $C$ is invertible. Indeed, it is
quite likely for $C$ to be non-invertible.

In our implementation, we decided to use a circulant matrix $C$. Recall that
a square matrix is circulant if each row of the matrix is the cyclic shift of the
previous row. It is easy to see that the inverse of a circulant matrix is again
circulant. The use of a circulant matrix allows us to guarantee invertibility and
to avoid 2- and 4-cycles.

Assuming that the matrix is invertible in $\mathbb{F}_2$, the performance of the code
depends on the number of ones in the rows of the inverse. The ideal proportion
of ones in each row is 1/2. To see that, let $p$ be the proportion of ones in each
row of $C^{-1}$ and let $R$ be constructed randomly so that each column contains $h$
nonzero entries. If the information sequence $b$ has $w \ge 1$ nonzero entries, then
by elementary counting arguments, the expected proportion of nonzero entries
in $C^{-1}Rb$ is

$$\frac{1 - (1 - 2p)^{hw}}{2}.$$

For $p = 1/2$, the expected proportion of ones in the parity-check portion of the
coded sequence is 1/2 and is independent of $h$ and $w$. However, we see that it
is not crucial that the proportion be exactly 1/2 since the expected number of
nonzero entries is very flat near $p = 1/2$. This led us to analyze the number of
ones in the inverse of circulant matrices.

The ring of $n \times n$ circulant matrices in $M_n[\mathbb{F}_2]$ is isomorphic to $\mathbb{F}_2[x]/(x^n - 1)$
via the identification

$$[\, a_0 \; a_1 \; \cdots \; a_{n-1} \,] \leftrightarrow a_0 + a_1 x + \cdots + a_{n-1}x^{n-1}$$

where $[\, a_0 \; a_1 \; \cdots \; a_{n-1} \,]$ is the first row of a circulant matrix. The number of ones
in a row of a circulant matrix is the same as the number of nonzero coefficients
in the polynomial associated with the first row. The problem of inverting an $n \times n$
matrix $C$ in $\mathbb{F}_2$ is equivalent to finding $a(x)$, $b(x)$ in $\mathbb{F}_2[x]$ such that

$$f(x)a(x) + (x^n - 1)b(x) = 1,$$

where $f(x)$ is the polynomial associated with $C$. The coefficients of $a(x)$ form
the first row of $C^{-1}$ and the other rows of $C^{-1}$ are obtained by cyclic shifts.
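The correspondence between circulant inversion and the Bezout identity can be tried out on a small case. The following Python sketch is an illustration only: polynomials over the field with two elements are stored as integers (bit $i$ is the coefficient of $x^i$), the helper names are hypothetical, and the $7 \times 7$ example with first row $1 + x + x^4$ is chosen here for illustration and does not come from the paper.

```python
def deg(p):
    return p.bit_length() - 1

def pmul(a, b):
    """Product of two GF(2) polynomials stored as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def pdivmod(a, b):
    """Quotient and remainder of a by the nonzero polynomial b over GF(2)."""
    q = 0
    while a and deg(a) >= deg(b):
        shift = deg(a) - deg(b)
        q ^= 1 << shift
        a ^= b << shift
    return q, a

def ext_gcd(a, b):
    """Return (g, u, v) with u*a + v*b = g = gcd(a, b) over GF(2)."""
    r0, r1, u0, u1, v0, v1 = a, b, 1, 0, 0, 1
    while r1:
        q, r = pdivmod(r0, r1)
        r0, r1 = r1, r
        u0, u1 = u1, u0 ^ pmul(q, u1)
        v0, v1 = v1, v0 ^ pmul(q, v1)
    return r0, u0, v0

def circulant(first_row):
    """Each row is the cyclic shift of the previous row."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]

def circulant_inverse_row(first_row):
    """First row of the inverse circulant over GF(2), or None if singular."""
    n = len(first_row)
    f = sum(bit << i for i, bit in enumerate(first_row))
    modulus = (1 << n) ^ 1                   # x^n - 1 over GF(2)
    g, a, _ = ext_gcd(f, modulus)
    if g != 1:
        return None
    a = pdivmod(a, modulus)[1]               # representative of degree < n
    return [(a >> i) & 1 for i in range(n)]

row = [1, 1, 0, 0, 1, 0, 0]                  # 1 + x + x^4 with n = 7
inv_row = circulant_inverse_row(row)
print(inv_row, sum(inv_row))                 # inverse row and its number of ones
```

The count `sum(inv_row)` is the quantity whose asymptotic behavior the theorems of Section 2 describe.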

From Theorem 1, we see that we can always construct $C$ so that the
proportion of ones in its inverse is as close to 1/2 as desired by allowing more terms in
the information sequence. It is also relatively easy to arrange the entries of $C$ so
it does not have any 2- or 4-cycles.

4 Proofs of the Main Theorems and Additional Results 

The following lemma is elementary. We include a proof for the convenience of 
the reader. 

Lemma 1. Let $f(x)$ be a polynomial of degree $d$ in $\mathbb{F}_q[x]$. Then $f(x)$ is primitive
if and only if

$$\{0, 1, x, \ldots, x^{q^d - 2}\} \bmod f(x) = \mathbb{F}_q[x] \bmod f(x),$$

the set of polynomials of $\mathbb{F}_q[x]$ with degree less than $d$.

Proof. Recall that $f(x)$ is primitive if and only if the smallest $n$ for which $f(x)$
divides $x^n - 1$ is $n = q^d - 1$. There are $q^d$ distinct polynomials in

$$\{0, 1, x, \ldots, x^{q^d - 2}\} \bmod f(x)$$

precisely when $f(x)$ does not divide $x^m - x^n$ for $0 \le n < m \le q^d - 2$, which is
equivalent to the fact that $n = q^d - 1$ is the smallest integer for which $f(x)$ divides
$x^n - 1$. Since there are exactly $q^d$ distinct polynomials in $\mathbb{F}_q[x]$ of degree less
than $d$, the two sets are equal if and only if $f(x)$ is primitive.

The following lemma is used in the counting of nonzero terms in quotients. 

Lemma 2. Let $f(x)$ be a monic polynomial in $\mathbb{F}_q[x]$ of degree $d \ge 1$. For $s \ge 0$,
let $x^s = f(x)g_s(x) + r_s(x)$, with $\deg r_s < \deg f$, and let $c_s$ be the coefficient of
$x^{d-1}$ in $r_s(x)$. Then for $s = 1, 2, \ldots$,

$$g_s(x) = \sum_{k=0}^{s-1} c_k x^{s-1-k}.$$

Proof. We will use induction to verify the claim. Before proceeding with the
induction, we first make some preliminary observations. First note that for
$s = 0, \ldots, d-1$, $g_s(x) = 0$ and $r_s(x) = x^s$. Therefore $c_s = 0$ for $0 \le s < d - 1$ and
$c_{d-1} = 1$. For $s = d$,

$$x^d = f(x) - (f(x) - x^d)$$

is the canonical decomposition and thus $g_d(x) = 1$ and $r_d(x) = x^d - f(x)$. It
follows that for $s = 0, \ldots, d - 1$, $g_{s+1}(x) = xg_s(x) + c_s$. For $s \ge d$, we have
$x^{s+1} = f(x)g_{s+1}(x) + r_{s+1}(x)$ and also

$$\begin{aligned}
x^{s+1} &= xf(x)g_s(x) + xr_s(x) \\
&= xf(x)g_s(x) + c_s x^d + x\left(r_s(x) - c_s x^{d-1}\right) \\
&= f(x)\left[xg_s(x) + c_s\right] + \left[c_s r_d(x) + x\left(r_s(x) - c_s x^{d-1}\right)\right].
\end{aligned}$$

By uniqueness, we have

$$g_{s+1}(x) = xg_s(x) + c_s$$

for $s = d, d+1, \ldots$ also, and therefore the recurrence relationship holds for $s = 0, 1, \ldots$.

We now verify the formula for $g_s(x)$ by induction. For $s = 1$, we have from
above that $c_0 = 1$ if $\deg(f(x)) = 1$ and $c_0 = 0$ if $\deg(f(x)) > 1$. If $\deg(f(x)) = 1$,
$g_1(x) = 1$ and $c_0 = 1$ and so the formula holds. If $\deg(f(x)) > 1$, $g_1(x) = 0$ and
$c_0 = 0$ and the formula also holds. So the formula holds for $s = 1$. Assume that
the formula is true for $s$. Then

$$g_{s+1}(x) = xg_s(x) + c_s = \sum_{k=0}^{s-1} c_k x^{s-k} + c_s = \sum_{k=0}^{s} c_k x^{s-k}.$$

Thus the formula holds for $s + 1$ and by induction for all $s \ge 1$.
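Lemma 2 is easy to test numerically over the field with two elements. The sketch below is only a sanity check under stated assumptions: polynomials are stored as Python integers (bit $i$ is the coefficient of $x^i$), and the names `pdivmod` and `check_lemma2` are hypothetical helpers rather than anything defined in the paper.

```python
def pdivmod(a, b):
    """Quotient and remainder of a by the nonzero polynomial b over GF(2)."""
    q, db = 0, b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        shift = a.bit_length() - 1 - db
        q ^= 1 << shift
        a ^= b << shift
    return q, a

def check_lemma2(f, s_max):
    """Compare the quotient of x^s by f with sum_k c_k x^(s-1-k) for s = 1..s_max."""
    d = f.bit_length() - 1
    c = []                                    # c[k] = coefficient of x^(d-1) in x^k mod f
    for s in range(s_max + 1):
        quotient, remainder = pdivmod(1 << s, f)
        if s >= 1:
            # Distinct exponents, so the integer sum equals the GF(2) sum.
            predicted = sum(c[k] << (s - 1 - k) for k in range(s))
            if predicted != quotient:
                return False
        c.append((remainder >> (d - 1)) & 1)
    return True

print(check_lemma2(0b10011, 40))              # f(x) = 1 + x + x^4
print(check_lemma2(0b11111, 40))              # f(x) = 1 + x + x^2 + x^3 + x^4
```

The list `c` is exactly the top-coefficient column of a residue class table for $f(x)$, which is how the quotients can be read off from that table.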

One interesting consequence of Lemma 2 is that the quotients of $f(x)$ into
$x^s$ are implicit in the standard residue class table for $f(x)$. We will next prove
Theorem 3 which is used in the proofs of the other theorems.

Proof of Theorem 3. Without loss of generality, we can assume that $f(x)$
is monic. Let $n = q^d - 1$ and let $g(x) = (x^n - 1)/f(x)$. We first show that
$g(x)$ has $q^{d-1}(q - 1)$ nonzero terms. Since $x^n = f(x)g(x) + 1$, we have from Lemma 2
that $g(x) = \sum_{k=0}^{n-1} c_k x^{n-1-k}$. Since $f(x)$ is primitive, we can apply Lemma 1 to
conclude that the number of nonzero terms of $g(x)$ is the same as the number
of residue classes of $\mathbb{F}_q[x] \bmod f(x)$ such that the coefficient of $x^{d-1}$ is nonzero,
which is easily seen to be $q^{d-1}(q - 1)$.

For the general case, let $h(x) \in \mathbb{F}_q[x]$ be nonzero with $\deg(h(x)) < d$. Let
$c = \deg(h(x))$. By Lemma 1, there exists $s$, $0 \le s \le q^d - 2$, such that
$h(x) = x^s - f(x)u(x)$. Since $\deg(h(x)) < \deg(f(x))$, $\deg(u(x)) = s - d < s < n$. We
have

$$h(x)g(x) = x^s g(x) - f(x)g(x)u(x) = x^s g(x) - (x^n - 1)u(x).$$

Decompose $g(x) = g_1(x) + g_2(x)$ with $g_1(x)$ containing the terms with exponent
less than $n - s$ and $g_2(x)$ containing the terms with exponent at least $n - s$.
Then $x^s g_1(x)$ has degree less than $n$ and all exponents of $x^s g_2(x)$ are at least $n$.
Since $\deg(h(x)g(x)) < \deg(f(x)g(x)) = n$, we have $x^s g_2(x) - x^n u(x) = 0$. It
follows that $u(x) = x^{s-n} g_2(x)$ and thus $h(x)g(x) = x^s g_1(x) + x^{s-n} g_2(x)$. The
exponents of $x^s g_1(x)$ are at least $s$ and $\deg(x^{s-n} g_2(x)) = \deg(u(x)) < s$ and
so the two terms have no common exponents. We conclude that $h(x)g(x)$ has
the same number of nonzero terms as $g(x) = g_1(x) + g_2(x)$ and the theorem is
proved.
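For a small primitive polynomial the statement of Theorem 3 can be checked exhaustively. The sketch below does this for $f(x) = 1 + x + x^4$ over the field with two elements, where $q^{d-1}(q-1) = 8$. It assumes polynomials are stored as Python integers (bit $i$ is the coefficient of $x^i$); the helper names are hypothetical.

```python
def pmul(a, b):
    """Product of two GF(2) polynomials stored as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def pdivmod(a, b):
    """Quotient and remainder of a by the nonzero polynomial b over GF(2)."""
    q, db = 0, b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        shift = a.bit_length() - 1 - db
        q ^= 1 << shift
        a ^= b << shift
    return q, a

f, d = 0b10011, 4                      # f(x) = 1 + x + x^4, primitive over GF(2)
n = 2 ** d - 1
g, remainder = pdivmod((1 << n) ^ 1, f)    # g(x) = (x^n - 1) / f(x)

# Number of nonzero terms of h(x) g(x) for every nonzero h with deg h < d.
weights = {bin(pmul(h, g)).count("1") for h in range(1, 1 << d)}
print(remainder, weights)
```

All fifteen products have the same number of nonzero terms, in line with the theorem; replacing `f` by a non-primitive polynomial such as `0b11111` (with `n` set to its order) generally gives more than one value.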

From the proof of Theorem 3, we have the following corollary.

Corollary 1. Let $f(x) \in \mathbb{F}_q[x]$ have order $n$. Then for all $h(x)$ in the residue
classes $\mathbb{F}_q[x] \bmod f(x)$, the functions $h(x)(x^n - 1)/f(x)$ have the same number
of nonzero terms as $(x^n - 1)/f(x)$.

We prove a more general version of Theorem 1 and use it to derive Theorem 1.

Theorem 5. Let $f(x) \in \mathbb{F}_q[x]$ have order $n$ and for $0 \le s < n$, let $N_s$ be the
number of nonzero terms in $b_s(x)(x^n - 1)/f(x)$. (See equation (1).) Then

$$\lim_{t \to \infty} \frac{\text{Number of nonzero terms in } a_{tn+s}(x)}{tn + s} = \frac{N_s}{n}.$$

Proof. Let $m \ge 1$ and let $n$ be the order of $f$. Write $m = tn + s$, where $0 \le s < n$.
Then

$$x^m - 1 = x^s(x^{tn} - 1) + x^s - 1 = x^s(x^n - 1)\sum_{k=0}^{t-1} x^{kn} + x^s - 1.$$

Since $f(x)$ divides $x^n - 1$, it is easy to see that $x^s - 1$ and $x^m - 1$ have the same
greatest common divisor with $f(x)$, say $g_s(x)$. Let $f(x)a_s(x) + (x^s - 1)b_s(x) = g_s(x)$
with $\deg(a_s(x)) < s$ and $\deg(b_s(x)) < d$. Let $g(x) = (x^n - 1)/f(x)$. Then

$$\begin{aligned}
& f(x)\left[a_s(x) - b_s(x)x^s g(x)\sum_{k=0}^{t-1} x^{kn}\right] + (x^m - 1)b_s(x) \\
&\quad = f(x)a_s(x) + \left[x^m - 1 - x^s(x^n - 1)\sum_{k=0}^{t-1} x^{kn}\right] b_s(x) \\
&\quad = f(x)a_s(x) + (x^s - 1)b_s(x) \\
&\quad = g_s(x).
\end{aligned}$$

Observe that the degree of $a_s(x) - b_s(x)x^s g(x)\sum_{k=0}^{t-1} x^{kn}$ is

$$(t - 1)n + s + \deg(b_s(x)) + \deg(g(x)) = (t - 1)n + s + \deg(b_s(x)) + n - d,$$

which is less than $tn + s = m$ since $\deg(b_s(x)) < d$. We can conclude that

$$a_m(x) = a_{tn+s}(x) = a_s(x) - b_s(x)x^s g(x)\sum_{k=0}^{t-1} x^{kn}.$$

Since $\deg(b_s(x)g(x)) < n$, we see from the above formula that if we increase $t$
to $t + 1$, the number of nonzero terms in $a_{(t+1)n+s}(x)$ is increased by the number
of nonzero terms in $b_s(x)g(x)$, say $N_s$. Note that the factor $x^s$ does not affect
the number of nonzero terms. Thus for an increase of $n$ in the index, the number
of nonzero terms in $a_m$ is increased by $N_s$. Therefore

$$\lim_{t \to \infty} \frac{\text{Number of nonzero terms in } a_{tn+s}(x)}{tn + s} = \frac{N_s}{n}.$$

The proof of Theorem 5 is complete.

Theorem 1 and Theorem 2 are now simple consequences of Theorem 5. 



Proof of Theorem 1. Let $f(x)$ have degree $d$ and let $n = q^d - 1$ be the order
of $f(x)$. By Theorem 3, the number of nonzero terms in $b_s(x)(x^n - 1)/f(x)$ is
$q^{d-1}(q - 1)$ if $b_s(x)$ is not the zero polynomial. The polynomial $b_s(x)$ is zero if
and only if $f(x)$ is a factor of $x^s - 1$. It follows from Theorem 5 that for $s$ such
that $f(x)$ is not a factor of $x^s - 1$,

$$\frac{\text{Number of nonzero terms of } a_s(x)}{s} \to \frac{q^{d-1}(q - 1)}{q^d - 1} \quad \text{as } s \to \infty.$$

Theorem 1 is proved.



Proof of Theorem 2. Let $n$ be the order of $f(x)$. Then $1 \le n \le q^d - 1$. Suppose
that for $s$ such that $f(x)$ is not a factor of $x^s - 1$,

$$\frac{\text{Number of nonzero terms of } a_s(x)}{s} \to \frac{q^{d-1}(q - 1)}{q^d - 1} \quad \text{as } s \to \infty.$$

By Theorem 5, the limit must also have the form $N/n$ for some $N \ge 1$. Therefore

$$q^{d-1}(q - 1)\,n = (q^d - 1)\,N.$$

Since $q^{d-1}$ and $q^d - 1$ are relatively prime, $q^d - 1$ divides $(q - 1)n$. If $q^d - 1$
divides $n$, then $n = q^d - 1$ and $f(x)$ is primitive. Otherwise, $n < q^d - 1$ and we
must have $(q - 1)n = m(q^d - 1)$ or $n = m(q^d - 1)/(q - 1)$, for $m \in \{1, \ldots, q - 1\}$.
When $q = 2$, then $n = q^d - 1$ in both cases and so $f(x)$ is always primitive when
$q = 2$.

Proof of Theorem 4. Let $f(x) \in \mathbb{F}_q[x]$ be irreducible and have degree $d$ and
order $n$. Since $f(x)$ is irreducible, $n \mid q^d - 1$. Let $q^d - 1 = tn$. Then

$$x^{q^d - 1} - 1 = (x^n - 1)\sum_{k=0}^{t-1} x^{kn}.$$

Let $N$ be the number of nonzero terms in $(x^n - 1)/f(x)$.
Since the degree of $(x^n - 1)/f(x)$ is less than $n$, the number of nonzero terms in
$(x^{q^d - 1} - 1)/f(x)$ is $tN$. Hence

$$q^d - 1 = tn \quad \text{and} \quad q^{d-1}(q - 1) = tN.$$

We conclude as in the proof of Theorem 2 that $n = q^d - 1$ or $n = m(q^d - 1)/(q - 1)$.
Since $n$ divides $q^d - 1$ in this case, we must have $m$ a factor of $q - 1$. Thus $f(x)$
is either primitive or has order $m(q^d - 1)/(q - 1)$ with $m$ a factor of $q - 1$ and
the proof is complete.

5 Examples 

In this section, we give examples to illustrate the theorems we presented. 
Example 1. This example illustrates Theorem 1. Let $f(x) = 1 + x + x^4$. Then
$f(x)$ is primitive in $\mathbb{F}_2[x]$. Therefore by Theorem 1, the only nonzero asymptotic
value is $2^3/(2^4 - 1) = 8/15$. In fact, all primitive polynomials in $\mathbb{F}_2[x]$ of degree
4 have the same asymptotic value. The number of nonzero terms in $a_s(x)$ for $s$
from 1 to 200 is shown in Figure 1. The horizontal line of height 8/15 is also
shown.
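The behavior described in Example 1 can be reproduced with a short computation. The sketch below computes $a_s(x)$ for $f(x) = 1 + x + x^4$ by the extended Euclidean algorithm over the field with two elements and reports the ratio of nonzero terms to $s$. It assumes polynomials are stored as Python integers (bit $i$ is the coefficient of $x^i$); all function names are hypothetical.

```python
def pmul(a, b):
    """Product of two GF(2) polynomials stored as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def pdivmod(a, b):
    """Quotient and remainder of a by the nonzero polynomial b over GF(2)."""
    q, db = 0, b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        shift = a.bit_length() - 1 - db
        q ^= 1 << shift
        a ^= b << shift
    return q, a

def ext_gcd(a, b):
    """Return (g, u, v) with u*a + v*b = g = gcd(a, b) over GF(2)."""
    r0, r1, u0, u1, v0, v1 = a, b, 1, 0, 0, 1
    while r1:
        q, r = pdivmod(r0, r1)
        r0, r1 = r1, r
        u0, u1 = u1, u0 ^ pmul(q, u1)
        v0, v1 = v1, v0 ^ pmul(q, v1)
    return r0, u0, v0

def weight_of_a(f, s):
    """Number of nonzero terms of a_s(x), taken with degree less than s."""
    modulus = (1 << s) ^ 1                    # x^s - 1 over GF(2)
    _, u, _ = ext_gcd(f, modulus)
    return bin(pdivmod(u, modulus)[1]).count("1")

f, n = 0b10011, 15                            # 1 + x + x^4 has order 15
for s in (16, 151, 1501):                     # f does not divide x^s - 1 here
    print(s, weight_of_a(f, s) / s)           # compare with 8/15
```

As the proof of Theorem 5 predicts, increasing $s$ by the order 15 adds exactly 8 nonzero terms whenever $f(x)$ does not divide $x^s - 1$, which is why the ratio settles near 8/15.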




Fig. 1. Distribution of Asymptotic Values for $1 + x + x^4$.



We illustrate Theorem 5 with the following examples. 

Example 2. Let $f(x) = 1 + x + x^2 + x^3 + x^4$. Then $f(x)$ is irreducible but not
primitive in $\mathbb{F}_2[x]$. Clearly, $f(x)$ has order 5. For $s = 0, 1, 2, 3, 4, 5$, we computed
$a_s(x)$ and $b_s(x)$ such that

$$f(x)a_s(x) + (x^s - 1)b_s(x) = \gcd(f(x), x^s - 1).$$

Let $g(x) = (x^5 - 1)/f(x) = 1 + x$. The polynomials $a_s(x)$, $b_s(x)$, and $g(x)b_s(x)$
are given in the following table.

Table: The Polynomials related to $1 + x + x^2 + x^3 + x^4$.

s    $a_s(x)$            $b_s(x)$            $g(x)b_s(x)$
0    $1$                 $0$                 $0$
1    $1$                 $x^3 + x$           $x^4 + x^3 + x^2 + x$
2    $1$                 $x^2 + x$           $x^3 + x$
3    $x$                 $x^2 + x + 1$       $x^3 + 1$
4    $x^3 + x^2 + x$     $1 + x + x^3$       $x^2 + x^4 + 1 + x^3$
5    $1$                 $0$                 $0$

Fig. 2. Distribution of Asymptotic Values for $1 + x + x^2 + x^3 + x^4$.

When $f(x)$ is not a factor of $x^s - 1$, $g(x)b_s(x)$ has either 2 or 4 nonzero terms.
Therefore, the asymptotic values are 2/5 and 4/5 by Theorem 5. When $f(x)$ is
a factor of $x^s - 1$, the asymptotic value is 0. The number of nonzero terms of
$a_s(x)$ for $s$ from 1 to 200 are given in Figure 2. The horizontal lines with height
2/5 and 4/5 are also shown.
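The Bezout coefficients in Example 2 can be recomputed with the extended Euclidean algorithm. The sketch below is an illustration under stated assumptions: polynomials over the field with two elements are stored as Python integers (bit $i$ is the coefficient of $x^i$), and the helper names are hypothetical. It prints $s$, $a_s(x)$, $b_s(x)$ and $g(x)b_s(x)$ as bit masks together with the number of nonzero terms of $g(x)b_s(x)$.

```python
def pmul(a, b):
    """Product of two GF(2) polynomials stored as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def pdivmod(a, b):
    """Quotient and remainder of a by the nonzero polynomial b over GF(2)."""
    q, db = 0, b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        shift = a.bit_length() - 1 - db
        q ^= 1 << shift
        a ^= b << shift
    return q, a

def ext_gcd(a, b):
    """Return (g, u, v) with u*a + v*b = g = gcd(a, b) over GF(2)."""
    r0, r1, u0, u1, v0, v1 = a, b, 1, 0, 0, 1
    while r1:
        q, r = pdivmod(r0, r1)
        r0, r1 = r1, r
        u0, u1 = u1, u0 ^ pmul(q, u1)
        v0, v1 = v1, v0 ^ pmul(q, v1)
    return r0, u0, v0

f = 0b11111                                   # 1 + x + x^2 + x^3 + x^4
g = pdivmod((1 << 5) ^ 1, f)[0]               # (x^5 - 1) / f(x) = 1 + x

rows = []
for s in range(6):
    gcd_s, a_s, b_s = ext_gcd(f, (1 << s) ^ 1)    # (1 << 0) ^ 1 == 0 handles s = 0
    rows.append((s, a_s, b_s, pmul(g, b_s)))

for s, a_s, b_s, gb in rows:
    print(s, bin(a_s), bin(b_s), bin(gb), bin(gb).count("1"))
```

Reading each bit mask from the right gives the coefficients of $1, x, x^2, \ldots$, so for instance `0b1010` stands for $x^3 + x$.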

Example 3. Let $f(x) = 1 + x + x^5$. In $\mathbb{F}_2[x]$,

$$1 + x + x^5 = (x^2 + x + 1)(x^3 + x^2 + 1).$$

One can easily check that $f(x)$ has order 21 and that $g(x) = (x^{21} - 1)/f(x)$
factors into

$$g(x) = (x^6 + x^4 + x^2 + x + 1)(x + 1)(x^6 + x^5 + x^4 + x^2 + 1)(1 + x + x^3).$$

By direct computation, one can show that $g(x)b_s(x)$ has 10 nonzero terms when
$f(x)$ is not a factor of $x^s - 1$. It follows that the only nonzero asymptotic value
is 10/21. The number of nonzero terms of $a_s(x)$ for $s$ from 1 to 200 are given in
Figure 3. The horizontal line with height 10/21 is also shown.
Fig. 3. Distribution of Asymptotic Values for $1 + x + x^5$.

The next example shows that it is possible for $f(x)$ to have order
$m(q^d - 1)/(q - 1)$, where $m$ is a divisor of $q - 1$, and $(x^{q^d - 1} - 1)/f(x)$ to have
$q^{d-1}(q - 1)$ nonzero terms as stated in Theorem 4. This example also shows that
for $q \neq 2$, it is possible to have nonzero asymptotic value $q^{d-1}(q - 1)/(q^d - 1)$ for
a polynomial that is not primitive.

Example 4. Let $f_1(x) = 1 + 2x + x^3$ and $f_2(x) = 4 + 2x + x^3$. Then $f_1(x)$ and $f_2(x)$
are irreducible in $\mathbb{F}_5[x]$. One can check that $(x^{124} - 1)/f_i(x)$ has $100 = 5^2(5 - 1)$
nonzero terms for $i = 1, 2$. The order of $f_1(x)$ is $62 = 2(5^3 - 1)/(5 - 1)$ and that
of $f_2(x)$ is $31 = (5^3 - 1)/(5 - 1)$. The nonzero asymptotic values of both $f_1(x)$
and $f_2(x)$ are $5^2(5 - 1)/(5^3 - 1) = 100/124$.
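The orders of the two cubics can be recomputed by brute force. The sketch below is only an illustration: a monic polynomial over the field with five elements is stored as a list of coefficients from the constant term upward, and `order_mod_p` is a hypothetical helper that multiplies by $x$ and reduces modulo $f(x)$ until it returns to 1.

```python
def order_mod_p(f, p, limit):
    """Smallest k in 1..limit with x^k = 1 modulo the monic polynomial f over F_p.

    f lists the coefficients from the constant term upward and f[0] != 0 is assumed.
    """
    d = len(f) - 1
    one = [1] + [0] * (d - 1)
    r = one[:]                                 # x^0 mod f
    for k in range(1, limit + 1):
        top = r[-1]                            # coefficient that moves to x^d
        r = [0] + r[:-1]                       # multiply by x
        # Replace x^d by -(f_0 + f_1 x + ... + f_{d-1} x^{d-1}).
        r = [(c - top * fc) % p for c, fc in zip(r, f[:d])]
        if r == one:
            return k
    return None

f1 = [1, 2, 0, 1]                              # 1 + 2x + x^3
f2 = [4, 2, 0, 1]                              # 4 + 2x + x^3
print(order_mod_p(f1, 5, 124), order_mod_p(f2, 5, 124))
```

A quick cross-check uses the product of the three conjugate roots of an irreducible monic cubic, which equals $-f(0)$: this gives $x^{31} \equiv -f(0) \pmod{f(x)}$, so the order divides 31 exactly when $f(0) = 4$ in the field with five elements.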

The next example shows that the converse of Theorem 4 is not true. 

Example 5. Let $f(x) = 1 + x + x^2 + x^4$. Then $f(x)$ is irreducible but not
primitive in $\mathbb{F}_3[x]$. The order of $f(x)$ is $40 = (3^4 - 1)/(3 - 1)$ but there are 48
nonzero terms in $(x^{80} - 1)/f(x)$. Note that $q^{d-1}(q - 1) = 3^3 \cdot 2 = 54$.
Abstract. We show that deciding square-freeness of a sparse univariate
polynomial over $\mathbb{Z}$ and over the algebraic closure of a finite field $\mathbb{F}_p$ of $p$
elements is NP-hard. We also discuss some related open problems about
sparse polynomials.



1 Introduction 

In this paper we extend the class of problems on sparse polynomials which are 
known to be NP-hard. 

We recall that a polynomial $f \in \mathcal{R}[X]$ over a ring $\mathcal{R}$ is called $t$-sparse if it
is of the form

$$f(X) = \sum_{i=1}^{t} a_i X^{n_i} \tag{1}$$

with some $a_1, \ldots, a_t \in \mathcal{R}$ and some integers $0 \le n_1 < \ldots < n_t$.

For a sparse polynomial $f \in \mathbb{Z}[X]$, given by (1), the input size $S(f)$ of $f$ is
defined as

$$S(f) = \sum_{i=1}^{t} \log(|a_i| n_i + 2)$$

where $\log z$ denotes the binary logarithm.
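To make the definition concrete, the input size of a sparse integer polynomial can be computed directly from its list of terms. The sketch below assumes a polynomial is given as a list of (coefficient, exponent) pairs; the function name `input_size` is hypothetical.

```python
import math

def input_size(terms):
    """S(f) = sum of log2(|a_i| * n_i + 2) over the terms a_i * X^(n_i) of f."""
    return sum(math.log2(abs(a) * n + 2) for a, n in terms)

# f(X) = 5 - X^7 + 3 X^1000, given by three terms.
f_terms = [(5, 0), (-1, 7), (3, 1000)]
print(input_size(f_terms))

# A single monomial of degree 2^20 has input size just over 20.
print(input_size([(1, 2 ** 20)]))
```

The second call illustrates why sparse problems are delicate: the degree can be exponential in the input size, so algorithms that are polynomial in the degree are not polynomial in $S(f)$.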

Let $p$ be a prime number. Denote by $\Omega_p$ the algebraic closure of the finite
field $\mathbb{F}_p$ of $p$ elements.

Similarly, for a sparse polynomial $f \in \Omega_p[X]$ given by (1), the input size
$S(f)$ of $f$ is defined as

$$S(f) = \sum_{i=1}^{t} \log(q n_i + 2)$$

where $\mathbb{F}_q \subset \Omega_p$ is the smallest subfield of $\Omega_p$ containing all coefficients of $f$.
We recall that a polynomial $f \in \mathcal{R}[X]$ over the unique factorization
domain $\mathcal{R}$ is called square-free if it is not divisible by a square of a non-constant
polynomial.

We also refer to [7] for formal description of NP-hard and other related 
complexity classes. 

Since the pioneering papers [17,18,19] complexity problems on sparse polyno- 
mials have been studied quite extensively [6,8,9,10,11,12,13,15,16]. Nevertheless 
many natural questions about such polynomials remain open. 

Here we prove that testing square-freeness of sparse polynomials over $\mathbb{Z}$ and
over $\Omega_p$ is NP-hard. Besides just being a natural problem, this question has also
been motivated by several other possible links and applications.

First of all we mention the problem of deciding whether a given sparse
polynomial over $\mathbb{Z}$ has a real root. The existence of multiple roots is a major obstacle
in obtaining efficient algorithms for this problem, see [3].

Another well-known related problem is sparse polynomial divisibility. That is,
given two sparse polynomials $f, g \in \mathbb{Z}[X]$, decide whether $g \mid f$. It has recently
been proved [11] that under the Extended Riemann Hypothesis this problem
belongs to the class co-NP, that is, there exists a short proof of the property
$g \nmid f$.

We also discuss such possible applications and mention several new related 
problems. 



2 Main Results 

We consider the following two problems: 

Sparse_Square-Free: Given a $t$-sparse polynomial $f \in \mathcal{R}[X]$, decide
whether $f$ is square-free

and

Sparse_GCD: Given two $t$-sparse polynomials $f, g \in \mathcal{R}[X]$, decide whether
$\deg \gcd(f, g) > 0$.

First of all we consider the case $\mathcal{R} = \mathbb{Z}$.

Theorem 1. Over $\mathbb{Z}$, Sparse_Square-Free and Sparse_GCD are equivalent
under randomized polynomial time reduction.

Proof. It is easy to see that Sparse_Square-Free is deterministic polynomial
time reducible to Sparse_GCD. Indeed, $f$ is square-free if and only if $f$ and $f'$
are relatively prime.

It remains to show that Sparse_GCD can be reduced to Sparse_Square- 
Free. 

Denote by M(s, t) the set of all t-sparse polynomials over Z of size at most 
s. Obviously 

\M{s,t)\ < 

We show that for all, but at most pairs a,b G ZZ the polynomials f -\- ag 
and f -\- bg are square-free for all relatively prime pairs f,gG M{s, f). 
Let us fix a pair $f, g \in M(s,t)$ of relatively prime polynomials. The
discriminant $D_{f,g}(Y)$ of the polynomial $f(X) + Yg(X)$ is a polynomial in $Y$ of degree
at most

$$\max\{\deg f, \deg g\} \le 2^s.$$

We remark that, because $f$ and $g$ are relatively prime, the bivariate polynomial
$f(X) + Yg(X) \in \mathbb{Z}[X, Y]$ is irreducible over $\mathbb{Q}$.

Therefore the polynomials $f(X) + Yg(X)$ and $f'(X) + Yg'(X)$ are relatively
prime. Recalling the resultant representation

$$D_{f,g}(Y) = \mathrm{Res}_X\left(f(X) + Yg(X),\ f'(X) + Yg'(X)\right)$$

we conclude that $D_{f,g}(Y)$ is not identical to zero and thus has at most $2^s$ zeros.
Considering all possible pairs f,g & M{s,t) we see that there are at most 

2‘2ts — l 

values of y which are roots of the discriminant Df^g{Y) for at least one relatively 
prime pair f,g€ M(s, t). Thus the number of pairs a,b € Z such that they are 
not roots of all discriminants Dx{Y) corresponding to all relatively prime pairs 
f,g & M{s,t) does not exceed 2^°®*. 

Now to test whether f,g G M{s,t) are relatively prime we select a random 
pair a, 6 of integers a and b with 

0 < a < 6 < 2^*® 

and test if $F = (f + ag)(f + bg)$ is square-free.

Indeed, if / and g are not relatively prime then, obviously, F is not square- 
free. 

If / and g are relatively prime then it is easy to verify that f + ag and f + bg 
are relatively prime as well. Because of the choice of a and b we conclude that 
f + ag and f + bg are square-free with probability at least 1 -I- 0(2“^*®) and thus 
F is square-free. 

It is also easy to check that the size of F is polynomially bounded in terms 
of S{f) and S{g). □ 
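The two directions used in the proof can be tried on toy inputs. The sketch below is only an illustration and not the sparse setting of the theorem: it uses dense coefficient lists over the rationals (constant term first), tests square-freeness through $\gcd(F, F')$, and forms $F = (f + ag)(f + bg)$ for fixed small $a$ and $b$ rather than random ones. All function names are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Copy p as Fractions and drop trailing zero coefficients."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_mod(a, b):
    """Remainder of a divided by the nonzero polynomial b over Q."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        coef, shift = a[-1] / b[-1], len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= coef * c
        a = trim(a)
    return a

def poly_gcd(a, b):
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_mod(a, b)
    return a

def derivative(p):
    return trim([i * c for i, c in enumerate(p)][1:])

def is_square_free(p):
    """p is square-free over Q iff gcd(p, p') is a nonzero constant."""
    return len(poly_gcd(p, derivative(p))) == 1

def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def shifted(f, g, a):
    """Return f + a*g."""
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return trim([x + a * y for x, y in zip(f, g)])

def reduction_polynomial(f, g, a, b):
    """F = (f + a g)(f + b g), the polynomial tested for square-freeness."""
    return poly_mul(shifted(f, g, a), shifted(f, g, b))

# f and g share the factor X - 1, so F contains (X - 1)^2.
F_common = reduction_polynomial([-1, 0, 1], [2, -3, 1], 1, 3)
# f = X^2 + 1 and g = X are coprime; a = 1, b = 3 avoid the discriminant roots.
F_coprime = reduction_polynomial([1, 0, 1], [0, 1], 1, 3)
print(is_square_free(F_common), is_square_free(F_coprime))
```

With $a = 1$ and $b = 2$ the coprime pair above gives $f + 2g = (X + 1)^2$, so $F$ is not square-free even though $f$ and $g$ are coprime; this is the kind of exceptional value that the discriminant bound in the proof excludes.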

It has been shown in [19] that over $\mathbb{Z}$ Sparse_GCD is NP-hard, see also [17],
[18]. Therefore, from Theorem 1 we obtain the following statement.

Corollary 1. Over $\mathbb{Z}$, Sparse_Square-Free is NP-hard.

Now we turn to the case $\mathcal{R} = \Omega_p$.

Theorem 2. Over $\Omega_p$, Sparse_Square-Free and Sparse_GCD are equivalent
under randomized polynomial time reduction.

Proof. As before, only the reduction of Sparse_GCD to Sparse_Square-Free
is non-trivial.



( 2 ^*® - 1 ) 2 ® < 2 



5ts 
Denote by $M_q(s, t)$ the set of all $t$-sparse polynomials over $\mathbb{F}_q$ of size at most
$s$. Obviously

\Mq{s,t)\<q^2^\ 

Using the algorithm of [21] (or one of previously known less efficient algo- 
rithms) in probabilistic polynomial time we construct an extension of Fg of 
degree N = 6st. As in the proof of Theorem 1, we see that for all, but at most 

q‘2t23st^ 

pairs $a, b \in \mathbb{F}_{q^N}$, the polynomials $f + ag$ and $f + bg$ are square-free for
all relatively prime pairs $f, g \in M_q(s,t)$.

Now to test whether $f, g \in M_q(s,t)$ are relatively prime we select a random
pair $a, b \in \mathbb{F}_{q^N}$ and test if

$$F = (f + ag)(f + bg) \tag{2}$$

is square-free.

Indeed, if / and g are not relatively prime then, obviously, F is not square- 
free. 

If / and g are relatively prime then it is easy to verify that f + ag and f + bg 
are relatively prime as well. Because of the choice of a and b we conclude that 
f + ag and f + bg are square- free with probability at least 



^2t2^3st 



q 



N 



l-kO(2-") 



and thus F is square-free. 

It is also easy to check that the size of F is polynomially bounded in terms 
of S{f) and S{g). □ 



It follows from the chain of reductions of [10], which has been used to show
#P-hardness of the counting of rational points on a sparse plane curve over a
finite field, that over $\Omega_p$ the problem Sparse_GCD is NP-hard.

Therefore, from Theorem 2 we obtain the following statement.

Corollary 2. Over $\Omega_p$, Sparse_Square-Free is NP-hard.



3 Remarks 

There are several more possible extensions of our results. First of all the reduction 
we describe in Theorems 1 and 2 can be applied to polynomials given by straight- 
line programs and to multivariate sparse polynomials. 

Our reduction in Theorem 2 uses an extension of the ground field $\mathbb{F}_q$. It
would be interesting to find a reduction over the same field. For polynomials
given by straight-line programs this can be done via considering the norm of the
polynomial (2)

$$\mathcal{F}(X) = \mathrm{Norm}_{\mathbb{F}_{q^N}/\mathbb{F}_q} F(X) = \prod_{i=1}^{N}\left(f(X) + a^{q^i} g(X)\right) \prod_{i=1}^{N}\left(f(X) + b^{q^i} g(X)\right).$$
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We see that if / and g are given by straight-line programs of polynomial size 
then also has a straight-line program of polynomial size. On the other hand, 
unfortunately W contains a superpolynomial number of monomials. Indeed, it is 
easy to show that is T-sparse with 

T< 

where $p$ is the characteristic of $\mathbb{F}_q$ and $c > 0$ is an absolute constant. If $p$ and
$t$ are both fixed then both the sparsity $T$ and the input size of the norm are polynomial in $S(f)$
and $S(g)$. However, for sparse polynomials with a fixed number of monomials we
do not have the corresponding NP-hardness result for computing $\gcd(f, g)$. In
both works [10] and [19] the sparsity grows together with the input size, and
thus the final link is missing.

Another interesting related question which probably can be studied by the
method of this paper is deciding irreducibility of sparse polynomials.
Unfortunately, for irreducibility there is no analogue of the discriminant
characterization of square-freeness. Nevertheless, it is possible that effective
versions [1,2,4,5,14,20,22] of the Hilbert Irreducibility Theorem (or their
improvements) can help to approach this problem.

Unfortunately we do not know any nontrivial upper bounds for the aforementioned
problems. For example, it would be interesting to show that testing
square-freeness of sparse univariate polynomials over $\mathbb{F}_q$ can be done
in PSPACE.

Finally, it is very interesting to study similar questions for sparse integers,
that is, for integers of the form $f(2)$, where $f$ is a sparse polynomial.
Several results have been obtained in [18,19] but many more natural questions
remain open.
Abstract. We present a new formulation of the Mastrovito multiplication
matrix and an architecture for the multiplication operation in the
field $GF(2^m)$ generated by an arbitrary irreducible polynomial. We study
in detail several specific types of irreducible polynomials, e.g., trinomials,
all-one-polynomials, and equally-spaced-polynomials, and obtain the
time and space complexity of these designs. Particular examples, illustrating
the properties of the proposed architecture, are also given. The
complexity results established in this paper match the best complexity
results known to date. The most important new result is the space complexity
of the Mastrovito multiplier for an equally-spaced-polynomial,
which is found as $(m^2 - \Delta)$ XOR gates and $m^2$ AND gates, where
$\Delta$ is the spacing factor.



1 Introduction 

Efficient hardware implementations of the arithmetic operations in the Galois
field $GF(2^m)$ are frequently desired in coding theory, computer algebra, and
public-key cryptography [10,9,6]. The measure of efficiency is the number of gates
(XOR and AND) and also the total gate delay of the circuit. The representation
of the field elements has a crucial role in the efficiency of the architectures
for the arithmetic operations. For example, the well-known Massey-Omura [11]
algorithm uses the normal basis representation, where the squaring of a field
element is equivalent to a cyclic shift in its binary representation. Efficient
bit-parallel algorithms for the multiplication operation in the canonical basis
representation, which have much less space and time complexity than the
Massey-Omura multiplier, have also been proposed.

The standard (polynomial) basis multiplication requires a polynomial modular
multiplication followed by a modular reduction. In practice, these two steps
can be combined. A novel method of multiplication is proposed by Mastrovito
[7,8], where a matrix product representation of the multiplication operation is
used. The Mastrovito multiplier using the special generating trinomial
$x^m + x + 1$ is shown to require $(m^2 - 1)$ XOR gates and $m^2$ AND gates
[7,8,12,13]. It has been conjectured [7] that the space complexity of the
Mastrovito multiplier would also be the same for all trinomials of the form
$x^m + x^n + 1$ for $n = 1, 2, \ldots, m-1$.

* This research is supported in part by Secured Information Technology, Inc. 
This conjecture was shown to be false by Sunar and Koç [14] for the case of
$m = 2n$. The architecture proposed in [14] requires $(m^2 - 1)$ XOR gates and
$m^2$ AND gates when $m \neq 2n$. However, the required number of XOR gates is
reduced to $(m^2 - \frac{m}{2})$ for the trinomial $x^m + x^{m/2} + 1$, where
$m$ is even.

In this study, we generalize the approach of [14] in several different ways.
We describe a method of construction for the Mastrovito multiplier for a general
irreducible polynomial. We give a detailed space and time analysis of the
proposed method for several types of irreducible polynomials. In each case, the
method proposed in this paper gives complexity results matching the best results
known to date. The detailed analyses are given in the full version of this
paper [2]. In this paper, we give the general approach and summarize the findings.

The most important result of this study is in the case of an
equally-spaced-polynomial (ESP), i.e., a polynomial of the form

$$p(x) = x^{k\Delta} + x^{(k-1)\Delta} + \cdots + x^{\Delta} + 1 , \qquad (1)$$

where $k\Delta = m$. The proposed Mastrovito multiplier for an ESP requires
$(m^2 - \Delta)$ XOR gates and $m^2$ AND gates. For $k = 2$, the ESP reduces to
the equally-spaced-trinomial (EST) $x^m + x^{m/2} + 1$, and for $\Delta = 1$, it
reduces to the all-one-polynomial (AOP). Our method requires
$(m^2 - \frac{m}{2})$ XOR and $m^2$ AND gates for the trinomial of the form
$x^m + x^{m/2} + 1$ for an even $m$, matching the result in [14]. Furthermore,
our proposed architecture requires $(m^2 - 1)$ XOR gates and $m^2$ AND gates
when the irreducible polynomial is an AOP. This result matches the best known
space complexity result to date for the canonical basis multiplication based on
an irreducible AOP, as given in [5]. Their architecture requires $(m^2 - 1)$ XOR
gates and $m^2$ AND gates, and has the lowest space complexity among similar
bit-parallel multipliers [7,4,3].

We introduce the fundamentals of the Mastrovito multiplier and the notation
of this paper in §2. The architecture of the Mastrovito multiplier for a general
irreducible polynomial is described in §3. We also give a detailed complexity
analysis in §4. The full version of this paper [2] contains the architectural
details of the multipliers based on binomials, trinomials, ESPs, and AOPs. For
each case, a detailed complexity analysis and a design example are given in [2].
The conclusions of this study are summarized in §5.



2 Notation and Preliminaries 

Let $p(x)$ be the irreducible polynomial generating the Galois field $GF(2^m)$.
In order to compute the multiplication $c(x) = a(x)b(x) \bmod p(x)$ in
$GF(2^m)$, where $a(x), b(x), c(x) \in GF(2^m)$, we need to first compute the
product polynomial

$$d(x) = a(x)b(x) = \left( \sum_{i=0}^{m-1} a_i x^i \right) \left( \sum_{i=0}^{m-1} b_i x^i \right) \qquad (2)$$
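and then reduce $d(x)$ using $p(x)$ to find the result $c(x) \in GF(2^m)$. We can
compute the coefficients of $d(x)$ using the following matrix-vector product:

$$
\begin{bmatrix}
d_0 \\ d_1 \\ d_2 \\ \vdots \\ d_{m-2} \\ d_{m-1} \\ d_m \\ d_{m+1} \\ \vdots \\ d_{2m-3} \\ d_{2m-2}
\end{bmatrix}
=
\begin{bmatrix}
a_0 & 0 & 0 & \cdots & 0 & 0 \\
a_1 & a_0 & 0 & \cdots & 0 & 0 \\
a_2 & a_1 & a_0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
a_{m-2} & a_{m-3} & a_{m-4} & \cdots & a_0 & 0 \\
a_{m-1} & a_{m-2} & a_{m-3} & \cdots & a_1 & a_0 \\
0 & a_{m-1} & a_{m-2} & \cdots & a_2 & a_1 \\
0 & 0 & a_{m-1} & \cdots & a_3 & a_2 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a_{m-1} & a_{m-2} \\
0 & 0 & 0 & \cdots & 0 & a_{m-1}
\end{bmatrix}
\begin{bmatrix}
b_0 \\ b_1 \\ b_2 \\ \vdots \\ b_{m-2} \\ b_{m-1}
\end{bmatrix}
\qquad (3)
$$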
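As an illustration of Equation (3), the following Python sketch builds the
$(2m-1) \times m$ matrix from the coefficients of $a(x)$ and checks the
matrix-vector product against a schoolbook polynomial product over GF(2). The
function names and the low-order-first coefficient lists are assumptions made
for this example only; they are not part of the notation of the paper.

```python
def mult_matrix(a):
    """(2m-1) x m matrix of Equation (3); a[i] is the coefficient of x^i."""
    m = len(a)
    return [[a[r - c] if 0 <= r - c < m else 0 for c in range(m)]
            for r in range(2 * m - 1)]

def mat_vec_gf2(M, b):
    """Matrix-vector product over GF(2): AND for products, XOR for sums."""
    return [sum(x & y for x, y in zip(row, b)) & 1 for row in M]

def poly_mul_gf2(a, b):
    """Schoolbook product of two binary polynomials, used as a reference."""
    d = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            d[i + j] ^= ai & bj
    return d

a = [1, 1, 0, 1]   # a(x) = 1 + x + x^3
b = [0, 1, 1, 0]   # b(x) = x + x^2
d = mat_vec_gf2(mult_matrix(a), b)
assert d == poly_mul_gf2(a, b)
```

Row $r$ of the matrix holds $a_{r-c}$ in column $c$, so each $d_r$ is the usual
convolution sum of the coefficients of $a(x)$ and $b(x)$.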



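We will denote this multiplication matrix by $M$ and its rows by $M_i$,
where $i = 0, 1, \ldots, 2m-2$. Note that the entries of $M$ solely consist of
coefficients of $a(x)$. We also define the $m \times m$ submatrix $U^{(0)}$ of
$M$ as the first $m$ rows of $M$, and the $(m-1) \times m$ submatrix $L^{(0)}$
of $M$ as the last $(m-1)$ rows of $M$, i.e.,

$$
U^{(0)} =
\begin{bmatrix}
a_0 & 0 & 0 & \cdots & 0 & 0 \\
a_1 & a_0 & 0 & \cdots & 0 & 0 \\
a_2 & a_1 & a_0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
a_{m-2} & a_{m-3} & a_{m-4} & \cdots & a_0 & 0 \\
a_{m-1} & a_{m-2} & a_{m-3} & \cdots & a_1 & a_0
\end{bmatrix} ,
\qquad (4)
$$

$$
L^{(0)} =
\begin{bmatrix}
0 & a_{m-1} & a_{m-2} & \cdots & a_2 & a_1 \\
0 & 0 & a_{m-1} & \cdots & a_3 & a_2 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a_{m-1} & a_{m-2} \\
0 & 0 & 0 & \cdots & 0 & a_{m-1}
\end{bmatrix} .
$$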

We use superscripts to denote the step numbers during the reduction process,
for example, $L^{(1)}$, $U^{(1)}$, etc. The superscript $f$ indicates the final
form of the matrix, for example, $L^{(f)}$. The rows in the submatrix $L^{(0)}$
of matrix $M$ are reduced using the irreducible polynomial $p(x)$, so that, at
the end of reduction, $L^{(f)}$ becomes the zero matrix. During the reduction
process, the rows of L are added to the rows with lower indices according to the
irreducible polynomial. During the reduction of a single row, this row is added
to certain other rows. We call all of these rows the children of the reduced
row. The final submatrix $U^{(f)}$ is equal to the so-called Mastrovito matrix
$Z$, which is multiplied by the column vector $b$ to produce the result as
$c = Zb$.

We use $\|$ to represent the concatenation of two vectors. For example, the
vectors $V = [v_n, v_{n-1}, \ldots, v_0]$ and $W = [w_n, w_{n-1}, \ldots, w_0]$
can be concatenated to form a new vector of length $2(n+1)$ as follows:

$$V \,\|\, W = [\, v_n \;\; v_{n-1} \;\cdots\; v_1 \;\; v_0 \;\; w_n \;\; w_{n-1} \;\cdots\; w_1 \;\; w_0 \,] .$$

In general, two vectors need not be of equal length in order to be concatenated.
Also during the reduction, the vectors are shifted to the right or left, where
the empty locations are filled with zeros. We use the right and left arrows to
represent the right and left shifts. For example, $(V \rightarrow 3)$ and
$(V \leftarrow 2)$ represent right and left shifts of the vector $V$ by 3 and 2
positions, respectively, which are explicitly given as

$$(V \rightarrow 3) = [\, 0 \;\; 0 \;\; 0 \;\; v_n \;\; v_{n-1} \;\cdots\; v_6 \;\; v_5 \;\; v_4 \;\; v_3 \,] ,$$

$$(V \leftarrow 2) = [\, v_{n-2} \;\; v_{n-3} \;\; v_{n-4} \;\; v_{n-5} \;\; v_{n-6} \;\cdots\; v_1 \;\; v_0 \;\; 0 \;\; 0 \,] .$$

Furthermore, at certain steps of the reduction, vectors are used to form matrices.
For example, to form a matrix using the last $(n-1)$ entries of the above vectors,
the following notation is adopted:

$$
\begin{bmatrix}
V \\ (V \rightarrow 3) \\ (V \leftarrow 2)
\end{bmatrix}_{3 \times (n-1)}
=
\begin{bmatrix}
v_{n-2} & v_{n-3} & v_{n-4} & v_{n-5} & \cdots & v_3 & v_2 & v_1 & v_0 \\
0 & v_n & v_{n-1} & v_{n-2} & \cdots & v_6 & v_5 & v_4 & v_3 \\
v_{n-4} & v_{n-5} & v_{n-6} & v_{n-7} & \cdots & v_1 & v_0 & 0 & 0
\end{bmatrix} .
$$

As seen above, although the original vectors are longer, only the last $(n-1)$
entries are used, and the rest are discarded.
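The vector notation above can be mirrored by a few small helpers. The sketch
below is only an illustration: the helper names (`concat`, `shift_right`,
`shift_left`, `last`) are hypothetical, vectors are Python lists written
left to right exactly as in the text, and string labels stand in for the
symbolic entries $v_j$.

```python
def concat(v, w):
    """V || W: the entries of V followed by the entries of W."""
    return v + w

def shift_right(v, s):
    """(V -> s): shift right by s positions, zero fill, same length."""
    s = min(s, len(v))
    return [0] * s + v[:len(v) - s]

def shift_left(v, s):
    """(V <- s): shift left by s positions, zero fill, same length."""
    s = min(s, len(v))
    return v[s:] + [0] * s

def last(v, count):
    """Keep only the last `count` entries of v (count <= len(v))."""
    return v[len(v) - count:]

n = 7
V = [f"v{j}" for j in range(n, -1, -1)]          # [v7, v6, ..., v0]
rows = [last(V, n - 1),
        last(shift_right(V, 3), n - 1),
        last(shift_left(V, 2), n - 1)]           # the 3 x (n-1) matrix
```

With $n = 7$ the three rows come out as $[v_5 \ldots v_0]$,
$[0 \; v_7 \ldots v_3]$ and $[v_3 \ldots v_0 \; 0 \; 0]$, which is the pattern
of the $3 \times (n-1)$ matrix above.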

During the reduction operation, we frequently encounter certain special
matrices. A particular matrix type is the Toeplitz matrix, which is a matrix whose
entries are constant along each diagonal. It is well known that the sum of two
Toeplitz matrices is also a Toeplitz matrix [1]. This property will be used to
establish a recursion.
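The closure property can be checked on a small case. The following sketch,
with hypothetical helper names, builds two binary Toeplitz matrices from their
first column and first row and confirms that their entrywise XOR (addition over
GF(2)) is again constant along each diagonal.

```python
def toeplitz(first_col, first_row):
    """Entry (r, c) depends only on r - c; first_col[0] must equal first_row[0]."""
    return [[first_col[r - c] if r >= c else first_row[c - r]
             for c in range(len(first_row))]
            for r in range(len(first_col))]

def is_toeplitz(A):
    """True if every entry equals its upper-left diagonal neighbour."""
    return all(A[r][c] == A[r - 1][c - 1]
               for r in range(1, len(A)) for c in range(1, len(A[0])))

def add_gf2(A, B):
    """Entrywise sum over GF(2)."""
    return [[x ^ y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

A = toeplitz([1, 0, 1], [1, 1, 0, 1])
B = toeplitz([0, 1, 1], [0, 0, 1, 1])
C = add_gf2(A, B)
assert is_toeplitz(A) and is_toeplitz(B) and is_toeplitz(C)
```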

Finally, we note that the gates used in the proposed design are 2-input AND
and XOR gates, whose delays are denoted by $T_A$ and $T_X$, respectively.

3 General Polynomials 

We start with the most general form of an irreducible polynomial as

$$p(x) = x^{n_k} + x^{n_{k-1}} + \cdots + x^{n_1} + x^{n_0} , \qquad (6)$$

where $n_i$ for $i = 0, 1, 2, \ldots, k$ are nonnegative integers with the property

$$m = n_k > n_{k-1} > \cdots > n_1 > n_0 = 0 .$$

The difference between the highest two orders, i.e.,
$n_k - n_{k-1} = m - n_{k-1}$, will be denoted by $\Delta$.

In the following, we first summarize the general outline of the reduction 
process, and then, propose a method to obtain the same result more efficiently. 

When the irreducible polynomial (6) is used to reduce the rows of $L^{(0)}$, each
row will have $k$ children. The one corresponding to the constant term
$x^{n_0} = 1$ is guaranteed to be added to a row in U, but the others might be
added to the rows of L, and will need to be reduced further. To simplify the
notation and observe the regularity, we use $k$ additional matrices. The children
produced due to the reductions corresponding to the $x^{n_i}$ term will be added
to the $m \times m$ matrix Xi for $i = 0, 1, \ldots, (k-1)$. The children that
fall back into the submatrix L are stored in $L^{(1)}$, which are to be reduced
later.

By introducing the Xi matrices, we preserve the matrix $U^{(0)}$ during the
reduction. At the end of the first step, i.e., when every row of matrix
$L^{(0)}$ is reduced exactly once, the following matrices will be produced:

$$U^{(1)} = U^{(0)} + X0^{(1)} + X1^{(1)} + \cdots + X(k-1)^{(1)}$$


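where

$$
Xi^{(1)} =
\begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & 0 \\
0 & a_{m-1} & a_{m-2} & \cdots & a_2 & a_1 \\
0 & 0 & a_{m-1} & \cdots & a_3 & a_2 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a_{m-n_i+1} & a_{m-n_i}
\end{bmatrix}
\begin{matrix}
0 \\ \vdots \\ n_i - 1 \\ n_i \\ n_i + 1 \\ \vdots \\ m-1
\end{matrix}
\qquad (8)
$$
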

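for $i = 0, 1, \ldots, (k-1)$. The part of matrix $M$, which is to be further
reduced after the first step, will be

$$
L^{(1)} =
\begin{bmatrix}
0 & \cdots & 0 & l^{(1)}_{m-1} & l^{(1)}_{m-2} & \cdots & l^{(1)}_{\Delta+2} & l^{(1)}_{\Delta+1} \\
0 & \cdots & 0 & 0 & l^{(1)}_{m-1} & \cdots & l^{(1)}_{\Delta+3} & l^{(1)}_{\Delta+2} \\
\vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & l^{(1)}_{m-1} & l^{(1)}_{m-2} \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & l^{(1)}_{m-1} \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 \\
\vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0
\end{bmatrix}
\begin{matrix}
0 \\ 1 \\ \vdots \\ n_{k-1} - 3 \\ n_{k-1} - 2 \\ n_{k-1} - 1 \\ \vdots \\ m-2
\end{matrix}
$$
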



As seen above, the new matrix $L^{(1)}$, which will be reduced in the next step,
is also triangular. This means the new children will also be in the same form,
except that they will contain more zero terms at the beginning. Thus, it is clear
that if the same procedure is recursively applied, the submatrix $U^{(0)}$ will
never change, and the forms of the matrices Xi and L will remain the same after
every step, i.e., the Xi matrices will be trapezoidal and L will be triangular.
The entries which are zero and outside the indicated geometric regions after the
first iteration will always remain zero. Only the values inside these regions
will be changed during the rest of the reduction. The number of nonzero rows of
L after step $j$, i.e., the number of nonzero rows in $L^{(j)}$, is given as

$$r_j = (m-1) - j\,(m - n_{k-1}) = (m-1) - j\Delta$$
since there are $(m-1)$ nonzero rows initially, i.e., $r_0 = (m-1)$, and the
number is reduced by $\Delta = (m - n_{k-1})$ after each step. Thus, it will take

$$N[m,\Delta] = \left\lceil \frac{m-1}{\Delta} \right\rceil \qquad (11)$$

steps to reduce the whole matrix $L^{(0)}$. The number $r_j$ is also equal to the
number of nonzero terms in the row $L_0$ at the end of step $j$, i.e., the number
of nonzero terms in the row $L_0^{(j)}$ for $j = 1, 2, \ldots, N[m,\Delta]-1$.
Note that the range of $j$ does not include $j = N[m,\Delta]$: the number of
nonzero rows becomes zero after step $N[m,\Delta]$, but the number $r_j$ may be
negative for $j = N[m,\Delta]$.
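A small numeric check of these quantities may help. The sketch below assumes
the polynomial is described by its sorted exponent list
$[n_0, n_1, \ldots, n_k]$ (a representation chosen for this example) and uses
$x^8 + x^4 + x^3 + x + 1$ as the test case.

```python
def spacing(exps):
    """Delta = n_k - n_(k-1), the gap between the two highest exponents."""
    return exps[-1] - exps[-2]

def reduction_steps(m, delta):
    """N[m, Delta] = ceil((m - 1) / Delta), in integer arithmetic."""
    return -(-(m - 1) // delta)

def nonzero_rows(m, delta, j):
    """r_j = (m - 1) - j * Delta, the nonzero rows of L after step j."""
    return (m - 1) - j * delta

exps = [0, 1, 3, 4, 8]          # x^8 + x^4 + x^3 + x + 1
m, delta = exps[-1], spacing(exps)
N = reduction_steps(m, delta)
r = [nonzero_rows(m, delta, j) for j in range(N + 1)]
```

Here $\Delta = 4$ and $N[8,4] = 2$; the values $r_0 = 7$ and $r_1 = 3$ are
positive, while $r_2 = -1$ illustrates why $j = N[m,\Delta]$ is excluded from
the range.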

First the matrix $M$ is divided into the upper and lower submatrices $U^{(0)}$
and $L^{(0)}$. The matrix $L^{(0)}$ is reduced into $k$ matrices using the
irreducible polynomial, while $U^{(0)}$ is kept unchanged. The upper and lower
parts of the new matrices are then separated. The upper parts form the matrices
$Xi^{(1)}$ and the lower parts are accumulated into the matrix $L^{(1)}$. The
merged lower parts are to be further reduced in the next step. This procedure is
repeated until the matrix L becomes the zero matrix, i.e., all rows are reduced.
The total reduction process is illustrated in Figure 4. The sum of the matrices
in the last row, i.e., $U^{(0)}$ and $Xi^{(f)}$ for $i = 0, 1, \ldots, (k-1)$,
yields the Mastrovito matrix $Z$.
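The end result of this reduction can be reproduced with a short reference
computation. The sketch below is not the step-wise scheme described above: it
reduces the rows $M_{2m-2}, \ldots, M_m$ one at a time, highest index first,
adding each row to its $k$ children. Because the reduction is linear over
GF(2), this should yield the same matrix $Z$. The exponent-list representation
and all function names are assumptions of this example.

```python
def mastrovito_matrix(a, exps):
    """Z for a(x) and p(x) = sum of x^n over exps = [n_0 = 0, ..., n_k = m]."""
    m = exps[-1]
    rows = [[a[r - c] if 0 <= r - c < m else 0 for c in range(m)]
            for r in range(2 * m - 1)]
    for r in range(2 * m - 2, m - 1, -1):      # x^r = sum_i x^(r - m + n_i)
        for n_i in exps[:-1]:
            child = r - m + n_i
            rows[child] = [x ^ y for x, y in zip(rows[child], rows[r])]
    return rows[:m]

def mat_vec_gf2(Z, b):
    return [sum(z & x for z, x in zip(row, b)) & 1 for row in Z]

def mul_mod(a, b, exps):
    """Reference: schoolbook product followed by reduction modulo p(x)."""
    m = exps[-1]
    d = [0] * (2 * m - 1)
    for i in range(m):
        for j in range(m):
            d[i + j] ^= a[i] & b[j]
    for r in range(2 * m - 2, m - 1, -1):
        if d[r]:
            for n_i in exps[:-1]:
                d[r - m + n_i] ^= 1
    return d[:m]

def to_bits(x, m):
    return [(x >> i) & 1 for i in range(m)]

def from_bits(v):
    return sum(bit << i for i, bit in enumerate(v))

P = [0, 1, 3, 4, 8]                            # x^8 + x^4 + x^3 + x + 1
a, b = to_bits(0x57, 8), to_bits(0x83, 8)
c = mat_vec_gf2(mastrovito_matrix(a, P), b)
assert c == mul_mod(a, b, P)
```

Only the coefficients of $a(x)$ enter the matrix, so $Z$ can be formed once and
then applied to any $b$ with AND gates for the products and XOR gates for the
sums.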

A close inspection shows that all submatrices formed by the nonzero rows of
the matrices produced after the first iteration are Toeplitz matrices. When the
matrix $L^{(1)}$ is reduced further, the children will be added to the nonzero
rows of the matrices Xi and L, which are Toeplitz submatrices. Since the sum of
two Toeplitz matrices is also a Toeplitz matrix [1], the submatrices formed by
the nonzero rows will all be Toeplitz submatrices. Furthermore, since these
matrices are special Toeplitz matrices, computing only the first nonzero rows of
$Xi^{(f)}$ is sufficient to reconstruct them. Similarly, the matrix $M$ and the
submatrices $U^{(0)}$ and $L^{(0)}$ can be constructed using only the row
$M_{m-1}$. Furthermore, since all first nonzero rows of the matrices $Xi^{(f)}$
are identical, we only need to compute one of them. Thus, it suffices to work on
$X0_0^{(f)}$ whose final value is computed as follows:

$$X0_0^{(f)} = \sum_{j=0}^{N[m,\Delta]-1} L_0^{(j)} = L_0^{(0)} + L_0^{(1)} + \cdots + L_0^{(N[m,\Delta]-1)} . \qquad (12)$$

This will be used with $U^{(0)}$ to construct the final matrix $Z$. First, the
rows $Z_{n_i}$ are constructed by adding the corresponding rows of the matrix
$U^{(0)}$ to those of the Xi matrices for $i = 0, 1, \ldots, k-1$. Then, they are
extended to larger vectors Yi by concatenating the necessary parts of $L^{(0)}$
to their beginning so that the shifts of Yi produce the rows below them up to
the row $n_{i+1}$, or up to the row $(m-1)$ if $i = (k-1)$. This will simplify
the construction of the matrix $Z$. To further simplify the representations, the
first nonzero rows of Xi, which are all identical, represented by the vector
$V$, will be used. Instead of referring to $U^{(0)}$ and $L^{(0)}$, we will use
the original multiplication matrix $M$ or its entries $a_i$. The summary of the
proposed method is given below:
1. First, we compute $V$ given as

$$V = \sum_{j=0}^{N[m,\Delta]-1} L_0^{(j)} = [\, 0 \;\; v_{m-1} \;\; v_{m-2} \;\cdots\; v_3 \;\; v_2 \;\; v_1 \,] \qquad (13)$$

using the recursive definition of $L_0^{(j)}$,

$$L_0^{(j)} = \sum_{\substack{(k-1) \ge i \ge 0 \\ r_{j-1} > (m - n_i)}} \left( L_0^{(j-1)} \rightarrow (m - n_i) \right) \qquad (14)$$

for $1 \le j \le N[m,\Delta]-1$ to reduce everything to the sum of shifts of the
row $L_0^{(0)}$, or equivalently to the sum of rows in $M$. The above summation
means that the row $L_0^{(j-1)}$ is shifted $(m - n_{k-1})$ times,
$(m - n_{k-2})$ times, etc., until all entries become zero. Then, these are all
added to form $L_0^{(j)}$. Here we note that $V$ is not computed until it is
completely reduced to the sum of rows of $M$, since there might be
cancellations. This fact will be taken into account during the complexity
analysis.

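2. Then, we compute $Z_{n_i}$ for $i = 0, 1, \ldots, (k-1)$ using the following
recursive relations:

$$Z_0 = [a_0] \,\|\, [V]_{1 \times (m-1)} = [\, a_0 \;\; v_{m-1} \;\; v_{m-2} \;\cdots\; v_3 \;\; v_2 \;\; v_1 \,] , \qquad (15)$$

$$Z_{n_i} = \left( [\, a_{n_i} \;\cdots\; a_{n_{i-1}+1} \,] \,\|\, [\, Z_{n_{i-1}} \rightarrow \Delta_i \,]_{1 \times (m-\Delta_i)} \right) + V \qquad (16)$$

where $\Delta_i = (n_i - n_{i-1})$ for $i = 1, 2, \ldots, (k-1)$. Thus, if $V$ and
$Z_{n_{i-1}}$ are given as

$$V = [\, 0 \;\; v_{m-1} \;\; v_{m-2} \;\cdots\; v_3 \;\; v_2 \;\; v_1 \,] ,$$

$$Z_{n_{i-1}} = [\, a_{n_{i-1}} \;\; w_{m-1} \;\; w_{m-2} \;\cdots\; w_3 \;\; w_2 \;\; w_1 \,] ,$$

then $Z_{n_i}$ is obtained as

$$Z_{n_i} = [\, a_{n_i} \;\; (a_{n_i-1} + v_{m-1}) \;\cdots\; (a_{n_{i-1}} + v_{m-\Delta_i}) \;\; (w_{m-1} + v_{m-1-\Delta_i}) \;\cdots\; (w_{\Delta_i+1} + v_1) \,] .$$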

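3. By extending $Z_{n_i}$, we find
$Yi = [\, [M_{m+n_i}]_{1 \times (\Delta_{i+1}-1)} \,\|\, Z_{n_i} \,]$ for
$i = 0, 1, \ldots, (k-1)$, with $\Delta_k = m - n_{k-1} = \Delta$, as follows
(shown for $i \ge 1$, with $w_j$ the entries of $Z_{n_{i-1}}$ as in step 2):

$$Yi = [\, a_{n_{i+1}-1} \;\cdots\; a_{n_i+1} \;\; a_{n_i} \;\; (a_{n_i-1} + v_{m-1}) \;\cdots\; (a_{n_{i-1}} + v_{m-\Delta_i}) \;\; (w_{m-1} + v_{m-1-\Delta_i}) \;\cdots\; (w_{\Delta_i+1} + v_1) \,] . \qquad (17)$$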
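4. Finally, the whole $Z$ matrix is constructed as follows:

$$
Z =
\begin{bmatrix}
Y0 \\
Y0 \rightarrow 1 \\
\vdots \\
Y0 \rightarrow (\Delta_1 - 1) \\
Y1 \\
Y1 \rightarrow 1 \\
\vdots \\
Y(i-1) \rightarrow (\Delta_i - 1) \\
Yi \\
Yi \rightarrow 1 \\
\vdots \\
Y(k-2) \rightarrow (\Delta_{k-1} - 1) \\
Y(k-1) \\
Y(k-1) \rightarrow 1 \\
\vdots \\
Y(k-1) \rightarrow (\Delta - 1)
\end{bmatrix}_{m \times m}
\begin{matrix}
0 \\ 1 \\ \vdots \\ (n_1 - 1) \\ n_1 \\ (n_1 + 1) \\ \vdots \\ (n_i - 1) \\ n_i \\ (n_i + 1) \\ \vdots \\ (n_{k-1} - 1) \\ n_{k-1} \\ (n_{k-1} + 1) \\ \vdots \\ (m-1)
\end{matrix}
\qquad (18)
$$

We note that while the vectors Yi or their shifted versions are of different
length, we take the last $m$ elements of these vectors to form the $m \times m$
matrix $Z$.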
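The four steps above can be cross-checked numerically. The following sketch
implements them as read here, under stated assumptions: the polynomial is given
by its sorted exponent list, coefficient vectors are low-order first, the
condition $r_{j-1} > (m - n_i)$ in (14) is enforced implicitly because shifts
that push every entry out of the vector contribute zero, and the head of the
concatenation in (16) is taken to be $[a_{n_i} \cdots a_{n_{i-1}+1}]$. The
function names are hypothetical. The result is compared with a plain
row-by-row reduction of $M$.

```python
import random

def xor(v, w):
    return [x ^ y for x, y in zip(v, w)]

def shift_right(v, s):
    """(V -> s): right shift by s with zero fill; the length is preserved."""
    s = min(s, len(v))
    return [0] * s + v[:len(v) - s]

def direct_Z(a, exps):
    """Reference: reduce rows m..2m-2 of M one by one with p(x)."""
    m = exps[-1]
    rows = [[a[r - c] if 0 <= r - c < m else 0 for c in range(m)]
            for r in range(2 * m - 1)]
    for r in range(2 * m - 2, m - 1, -1):
        for n_i in exps[:-1]:
            rows[r - m + n_i] = xor(rows[r - m + n_i], rows[r])
    return rows[:m]

def recursive_Z(a, exps):
    """Steps 1 to 4; exps = [n_0 = 0, n_1, ..., n_k = m]."""
    m, k = exps[-1], len(exps) - 1
    delta = m - exps[-2]
    N = -(-(m - 1) // delta)                 # ceil((m - 1) / delta)
    # Step 1: V is the sum of L_0^(j); L_0^(0) = M_m = [0 a_(m-1) ... a_1].
    L0 = [0] + a[:0:-1]
    V = L0[:]
    for _ in range(1, N):
        nxt = [0] * m
        for n_i in exps[:-1]:
            nxt = xor(nxt, shift_right(L0, m - n_i))
        L0 = nxt
        V = xor(V, L0)
    Z = [None] * m
    Z[0] = [a[0]] + V[1:]                    # Equation (15)
    for i in range(k):
        n_i, n_next = exps[i], exps[i + 1]
        if i > 0:                            # Step 2, Equation (16) as read here
            d_i = n_i - exps[i - 1]
            head = [a[n_i - t] for t in range(d_i)]
            Z[n_i] = xor(head + Z[exps[i - 1]][:m - d_i], V)
        # Step 3: Yi = [a_(n_(i+1)-1) ... a_(n_i+1)] || Z_(n_i)
        Y = [a[n_next - 1 - q] for q in range(n_next - n_i - 1)] + Z[n_i]
        for s in range(1, n_next - n_i):     # Step 4: last m entries of (Yi -> s)
            Z[n_i + s] = shift_right(Y, s)[-m:]
    return Z

random.seed(1)
for exps in ([0, 1, 7], [0, 2, 5], [0, 1, 3, 4, 8], [0, 1, 2, 3, 4]):
    a = [random.randint(0, 1) for _ in range(exps[-1])]
    assert recursive_Z(a, exps) == direct_Z(a, exps)
```

In this form only the vector $V$ and the $(k-1)$ rows $Z_{n_i}$ with $i \ge 1$
involve additions; every other row of $Z$ is obtained by concatenation and
shifting alone.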



4 Complexity Analysis 

The formula of the vector $V$ in Equation (13) includes the shifted versions of
$L_0^{(j)}$, which finally reduce to the shifted versions of the row $L_0^{(0)}$
when the recursive formula is used. Since all right-shifted versions of the row
$L_0^{(0)} = M_m$ are present in the original multiplication matrix $M$, it is
possible to represent the vector $V$ as a sum of rows of this matrix. Except for
the row $L_0^{(0)}$ itself, the minimum shift is equal to $\Delta$. Thus, after
cancellations, the indices of the rows will be a subset $S$ of the set of indices

$$m, \; (m+\Delta), \; (m+\Delta+1), \; \ldots, \; (2m-3), \; (2m-2) .$$

The row with the smallest index can be used as a base for the addition,
and the rest of the rows will be added to this row. Thus, the actual subset to
be added to this base vector is $(S - \min S)$, which will be called $S^*$. Since
the row $M_j$ has exactly $(2m-1-j)$ nonzero terms for $2m-2 \ge j \ge m$,
and adding each nonzero term requires a single XOR gate, the total number
of XOR gates required to compute the first form of $V$ will be equal to the
sum $\sum_{j \in S^*} (2m-1-j)$. The delay in the computation of the vector $V$
can be minimized when the binary tree method is used to compute the summation in
each entry. Since there are at most $|S|$ items to be added to compute any entry,
the delay of the computation will be $\lceil \log_2 |S| \rceil T_X$, where $|S|$
denotes the order of the set $S$. When the recursive relations in Equations (15)
and (16) are used, the construction of $Z_0$ requires only rewiring. We then
construct $Z_{n_i}$ by adding the vector $V$ to the vector formed by
concatenation, which requires $(m-1)$ XOR gates since the vector $V$ has only
$(m-1)$ nonzero terms. Thus, we need $(k-1)(m-1)$ XOR gates to compute all
$Z_{n_i}$ for $i = 1, 2, \ldots, (k-1)$. Since the time needed to compute a
single row $Z_{n_i}$ is $T_X$, the total delay to compute all rows is
$(k-1)T_X$. The vectors Y0 and Y1 are then found using Equation (17) by
rewiring. The construction of $Z$ in Equation (18) is also performed using
rewiring since it consists of shifts.

To find the final result $c(x)$ via the product $c = Zb$, we also need $m^2$ AND
gates and $m(m-1)$ XOR gates. Each coefficient $c_i$ of the final result $c(x)$
can be computed independently via the product $c_i = Z_i b$. All multiplications
can be done in one level, and the $m$ terms can be added using the binary tree
method in $\lceil \log_2 m \rceil T_X$ time. Therefore, the entire computation
for the general case requires $m^2$ AND gates and

$$(m-1)(m+k-1) + \sum_{j \in S^*} (2m-1-j) \qquad (19)$$

XOR gates. The total delay of the circuit is given as

$$T_A + \left( \lceil \log_2 |S| \rceil + (k-1) + \lceil \log_2 m \rceil \right) T_X . \qquad (20)$$
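Equations (19) and (20) can be evaluated mechanically once the set $S$ is
known. The sketch below is one way to do this under stated assumptions: $S$ is
obtained by expanding the recursion (14) into shifts of the row $M_m$,
cancelling pairs over GF(2) and dropping shifts beyond $m-2$ (which correspond
to zero rows), and $S^*$ is taken as $S$ without its smallest element. The
exponent-list input and the function names are hypothetical, and $m \ge 2$ is
assumed.

```python
def ceil_log2(n):
    """Ceiling of log2(n) for an integer n >= 1."""
    return (n - 1).bit_length()

def row_index_set(exps):
    """Indices of the rows of M whose sum gives V (the set S)."""
    m = exps[-1]
    level = {0}                 # shifts of M_m that make up L_0^(j)
    total = {0}                 # L_0^(0) = M_m itself
    while level:
        nxt = set()
        for s in level:
            for n_i in exps[:-1]:
                t = s + m - n_i
                if t <= m - 2:  # M_(m+t) is a row of M only for t <= m - 2
                    nxt ^= {t}  # symmetric difference: pairs cancel over GF(2)
        level = nxt
        total ^= level
    return {m + s for s in total}

def complexity(exps):
    """(AND gates, XOR gates, number of T_X levels) from (19) and (20)."""
    m, k = exps[-1], len(exps) - 1
    S = row_index_set(exps)
    S_star = S - {min(S)}
    xor_gates = (m - 1) * (m + k - 1) + sum(2 * m - 1 - j for j in S_star)
    xor_levels = ceil_log2(len(S)) + (k - 1) + ceil_log2(m)
    return m * m, xor_gates, xor_levels

and_gates, xor_gates, xor_levels = complexity([0, 1, 7])   # x^7 + x + 1
```

For the trinomial $x^7 + x + 1$ this gives $S = \{7\}$, an empty $S^*$, and
$(m-1)(m+1) = m^2 - 1 = 48$ XOR gates, in line with the trinomial entry quoted
in the introduction. The total delay is then $T_A$ plus `xor_levels` times
$T_X$.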

5 Conclusions 

In this paper, we present a new architecture for the Mastrovito multiplication
and a rigorous analysis of its complexity for a general irreducible polynomial.
We show that it requires $m^2$ AND gates and
$(m-1)(m+k-1) + \sum_{j \in S^*} (2m-1-j)$ XOR gates, where $S^*$ is defined in §4.
In the full version of this paper [2], we extend this analysis to certain types of
irreducible polynomials. These results are summarized in Table 1.

Table 1: The XOR complexity results for the Mastrovito multiplier.

Polynomial   XOR Complexity                                Reference
Trinomial    $m^2 - 1$                                     [7] [8] [12] [13] [14] [2]
EST          $m^2 - m/2$                                   [14] [2]
AOP          $m^2 - 1$                                     [5] [2]
General      $(m-1)(m+k-1) + \sum_{j \in S^*} (2m-1-j)$    §4
ESP          $m^2 - \Delta$                                [2]